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EXECUTIVE  SUMMARY 


Quadratic  time-frequency  representations  (QTFRs)  are  used  in  this  report  to  analyze  the 
time-frequency  structure  of  cetacean  mammal  sounds.  The  nature  of  the  signals  and  their 
properties  were  determined,  and  the  types  of  QTFRs  that  are  best  suited  for  analyzing  these 
biological  signals  based  on  their  properties  were  investigated.  Analysis  of  the  group  delay 
structure  of  the  mammalian  vocal  signals  was  matched  to  the  appropriate  quadratic  time- 
frequency  class  for  proper  signal  processing  with  minimal  skewing  of  the  results. 

This  report  includes  a  discussion  of  the  mammalian  data  recordings  from  the  Woods  Hole 
Oceanographic  Institution  (WHOI)  and  an  introduction  to  quadratic  time-frequency  classes, 
properties,  and  mathematical  structure.  It  also  determines  the  appropriate  computational 
formulations  for  analyzing  underwater  mammalian  vocal  communications  using  quadratic  time- 
frequency  representations. 

There  are  two  main  marine  mammal  sounds:  short  broadband  clicks,  which  are  the  sounds 
used  by  dolphins  and  whales  for  echolocation,  and  whistles,  which  are  continuous,  frequency- 
modulated,  narrowband  signals  used  by  dolphins  and  whales  primarily  for  communication.  It 
has  been  determined  that  the  broadband  clicks  can  be  analyzed  using  constant  time-shift 
covariant  QTFRs,  such  as  the  Wigner  distribution  and  the  spectrogram,  as  these  QTFRs  are 
matched  to  signals  with  constant  or  linear  group  delay  characteristics.  On  the  other  hand,  it  has 
been  determined  that  whistles  have  characteristic  time-frequency  structures  that  may  be  matched 
to  the  generalized  time  shift  preserved  by  the  generalized  warped  QTFRs.  These  QTFRs  can  be 
obtained  by  appropriately  warping  the  constant  time-shift  covariant  QTFRs  of  Cohen's  class  or 
the  affine  class.  For  example,  it  was  observed  that  long-finned  pilot  whales  emit  whistles  with  a 
characteristic  hyperbolic  time-frequency  structure.  Thus,  they  were  well  analyzed  using 
hyperbolic  QTFRs,  since  these  QTFRs  preserve  hyperbolic  group  delay  changes  on  the  analysis 
signals.  The  hyperbolic  QTFRs  are  obtained  by  hyperbolically  warping  corresponding  QTFRs 
of  Cohen's  class.  Due  to  the  large  sampling  rate  of  the  data  provided  by  WHOI,  the  whistles 
extend  over  a  large  number  of  samples,  resulting  in  a  high  computational  intensity  in  the  existing 
algorithms.  Therefore,  short-time  time-frequency  techniques  and  short-time  adaptive  time- 
frequency  techniques  were  used  to  provide  an  adequate  analysis  of  the  whistles. 

A  large  selection  of  sound  files  from  the  WHOI  database  was  analyzed  and  the  time- 
frequency  structure  characterized.  The  time  duration  and  bandwidth  of  the  clicks  and  whistles  of 
various  genera  of  dolphins  and  whales  are  provided  in  this  report. 
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FOREWORD 


The  research  presented  in  this  report  outlines  the  time  frequency  characterization  of 
mammalian  click  and  whistle  sounds.  Analysis  of  the  group  delay  structure  of  the  cetacean 
mammalian  vocal  signals  was  matched  to  the  appropriate  quadratic  time-frequency  class  for 
proper  signal  processing  with  minimal  skewing  of  the  results. 

The  time  duration  and  bandwidth  of  the  clicks  and  whistles  of  various  genera  of  dolphins 
and  whales  are  provided  in  this  report.  These  data  are  useful  as  a  comparison  guide  to  other 
recorded  sonar  data  to  aid  in  the  identification  of  recorded  signals  of  unknown  origin  that  may 
belong  to  ocean  dwelling  mammals  via  the  time  duration  and  bandwidth  structure  of  the  recorded 
signals.  This  information  is  also  usefiil  for  areas  where  mammal  sounds  interfere  with  human¬ 
generated  sonar  transmissions  and  when  noise  characterization  is  required  for  signal  processing. 
The  characterized  mammalian  noise  can  be  useful  for  signal  processing  algorithms  that  may 
benefit  undersea  acoustic  communications.  This  report  also  outlines  the  correct  mathematical 
formulations  for  analysis  of  underwater  mammalian  sounds  with  quadratic  time-frequency 
representations. 
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USE  OF  QUADRATIC  TIME-FREQUENCY  REPRESENTATIONS 
TO  ANALYZE  CETACEAN  MAMMAL  SOUNDS 


1.  INTRODUCTION 


The  analysis  of  marine  mammal  sounds  has  been  a  subject  of  considerable  interest  for 
many  years,  as  marine  mammals  such  as  dolphins  and  whales  have  complicated  acoustics  and 
communications  systems.  Marine  mammal  sound  emissions  can  be  classified  into  two  broad 
categories:  sonar  signals  called  clicks,  used  for  echolocation*'^  and  non-echolocation  signals 
called  whistles,  used  for  communication.*'^'*  Sonar  clicks  are  short-duration,  broadband, 
transient-like  pulses  that  are  used  by  marine  mammals  to  detect,  localize,  discriminate,  and 
recognize  various  objects  of  interest  such  as  prey,  obstacles,  and  predators.*  Marine  mammals 
emit  bursts  of  clicks,  called  click  trains,  when  searching  for  targets.  The  number  of  clicks  and 
the  time  interval  between  clicks  depend  on  various  factors,  such  as  the  distance  of  interest,  the 
difficulty  in  detecting  a  target,  the  presence  or  absence  of  a  target  of  interest,  and  the  mammal's 
expectation  of  finding  a  specific  target.  Whistles  are  long-duration,  narrowband,  frequency- 
modulated,  continuous  tonal  sounds,  often  referred  to  as  squeaks,  squawks,  and  squeals,*  that  are 
not  used  for  active  sonar  searches.  They  vary  in  frequency,  duration,  and  in  the  shape  of  the 
whistle  time-fi'equency  structure,  depending  on  the  nature  of  the  signal's  frequency  modulation. 
The  whistles  are  used  mainly  for  communieation  between  the  animals. 

Marine  mammal  sounds  are  nonstationary  signals,  i.e.,  signals  whose  frequency  content 
changes  with  time;  thus,  they  have  time-frequency  characteristics  that  can  be  seen  by  a  sonogram 
or  spectrogram.*^'*’  The  Fourier  transform  is  of  limited  use  for  the  analysis  of  mammal  sounds, 
since  it  does  not  provide  easily  accessible  information  about  the  time  localization  of  a  given 
frequency  component  in  a  signal.  Analysis  tools  of  choice  are  the  quadratic  time-frequency 
representations  (QTFRs),  as  they  are  potentially  capable  of  displaying  the  temporal  localization 
of  the  signal's  spectral  components.**'’*  Some  well-known  QTFRs  include  the  Wigner 


1 


distribution,**’^^  the  spectrogram,^^’ the  Choi-Williams  distribution,^®’ the  scalogram,*^’^®  and 
the  Altes-Q  distribution.^’'^’  Although  many  QTFRs  have  been  proposed  in  the  literature,  no  one 
QTFR  exists  that  can  be  used  effectively  in  all  possible  applications.  The  choice  of  a  QTFR 
depends  on  a  specific  application  as  well  as  the  QTFR  properties  that  are  desirable  for  that 
application.  Thus,  QTFRs  may  be  classified  based  on  various  properties  that  they  satisfy. 

Some  important  QTFR  properties  include  the  covariance  properties.  A  QTFR  is  said  to 
satisfy  a  covariance  property  provided  that  the  QTFR  preserves  (i.e.,  is  covariant  to)  various 
time-frequency  changes  in  the  signal.  For  example,  if  a  signal  is  shifted  in  time,  then  a  QTFR 
satisfies  the  constant  (nondispersive)  time-shift  covariance  property  if  the  QTFR  preserves  the 
exact  time  shift  information  of  the  signal.'*  A  useful  QTFR  classification  is  based  on  the 
grouping  of  various  covariance  properties  that  the  QTFRs  satisfy.  In  particular,  Cohen's 
class’®’  ’®  consists  of  QTFRs  (with  signal-independent  kernels)  that  satisfy  the  constant  time-shift 
and  the  constant  frequency-shift  covariance  properties.  The  affine  class  QTFRs  ’  ’  ’ 
preserve  constant  time  shifts  and  dilations  (scale  changes)  on  the  analysis  signal.  The  hyperbolic 
class  QTFRs”’  are  covariant  to  hyperbolic  time  shifts  and  dilations  on  the  signal.  Other 
covariant  QTFR  classes  include  the  power  classes,’®"^®  the  exponential  class,'"’''’  and  the  power 
exponential  classes.'*’’ These  QTFR  classes  satisfy  different  covariance  properties  and,  as  a 
result,  are  used  for  different  types  of  applications. 

The  aim  of  this  project  was  to  analyze  the  time-frequency  structure  of  marine  mammal  sounds 
in  order  to  ascertain  the  type  and  properties  of  the  signal.  Various  types  of  QTFRs  were 
investigated  to  determine  those  QTFRs  that  are  best  suited  for  analyzing  biological  signals  based 
on  the  signal’s  properties.  Some  QTFRs  types,  for  example  Cohen's  class  of  time-frequency 
shift  covariant  QTFRs,  have  been  used  to  analyze  dolphin  whistles;'’  '*’  however,  this  report 
provides  analysis  of  other  types  of  QTFRs  that  match  the  characteristic  time-frequency  structure 
of  marine  mammal  sounds  to  a  particular  group  delay  curve  in  the  time-frequency  plane.  The 
MATLAB  code  for  some  of  the  QTFRs  already  exists,  such  as  the  code  for  some  of  Cohen's 
class,  affine  class,  and  hyperbolic  class  QTFRs,  and  new  MATLAB  code  was  written  for  those 
QTFRs  that  are  considered  well  matched  for  analysis  of  marine  mammal  sounds.  The  results  are 
summarized  in  the  following  sections. 
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This  report  is  organized  as  follows:  Section  2  describes  the  marine  mammal  sound  data 
files  provided  by  the  Woods  Hole  Oceanographic  Institution  (WHOl)'*^  and  provides  the 
MATLAB  code  necessary  to  access  the  data  for  processing.  Section  3  discusses  some  well- 
known  QTFR  classes  (Cohen’s  class  and  the  affine  class)  that  preserve  constant  time  shifts  on 
the  analysis  signal.  The  constant  time-shift  covariant  QTFRs,  including  the  Wigner  distribution 
and  the  spectrogram,  are  ideal  for  analyzing  signals  with  constant  or  linear  time-frequency 
characteristics,  such  as  dolphin  and  whale  clicks.  Section  4  presents  the  generalized  time-shift 
covariant  QTFR  classes  that  preserve  the  signal's  changes  in  group  delay.  These  QTFRs  include 
hyperbolic  QTFRs,  such  as  the  Altes  Q-distribution,  and  power  QTFRs,  such  as  the  power 
Wigner  distribution  that  are  better  matched  to  the  frequency-modulated  narrowband  whistles  of 
dolphins  and  whales.  Section  5  describes  the  implementation  technique  of  the  generalized  time- 
shift  covariant  QTFRs,  such  as  power  class  QTFRs,  using  the  warping  approach.  Section  6 
provides  an  analysis  of  sonar  clicks  and  click  trains.  Clicks  can  be  successfully  analyzed  using 
Cohen's  class  QTFRs,  such  as  the  Wigner  distribution  and  its  smoothed  versions,  due  to  the 
click’s  broadband  nature.  Section  7  analyzes  various  dolphin  and  whale  whistles,  and  provides 
tables  to  summarize  the  results. 

Whistles  may  be  matched  to  hyperbolic  class  QTFRs  and  power  class  QTFRs  depending 
on  their  charaeteristic  time-frequency  structure.  As  the  sampling  rate  of  the  sounds  provided  is 
relatively  large,  the  long-duration  whistles  consist  of  many  samples;  as  a  result,  the  present 
algorithms  developed  to  compute  hyperbolic  and  power  QTFRs  are  not  computationally 
efficient.  Thus,  short  time-frequency  techniques  were  developed  to  analyze  longer  sections  of 
the  data  in  an  on-line  fashion  for  both  hyperbolic  and  power  QTFRs. 
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2.  MARINE  MAMMAL  SOUNDS 


2.1  SOUND  DATABASE 

The  marine  mammal  sounds  analyzed  in  this  report  were  provided  by  WHOI’s  SOUND 
database.  This  database  contains  descriptive  information  about  each  of  the  sound  recordings  and 
is  accompanied  by  a  set  of  digital  sound-cut  files  whose  filenames  match  the  text  database 
record.  These  cuts  are  sequences  of  sound  from  the  repertoire  of  vocalizations  of  the  different 
species  of  marine  mammals  that  are  selected  from  the  original  underwater  recordings  and  are 
digitized.  The  digitized  sound  cuts  are  stored  as  independent  files  in  the  tapes  provided,  and  are 
accompanied  by  an  ASCII  file  that  contains  the  corresponding  database  records.  The 
organization  of  the  database  records  in  various  fields,  together  with  the  SOUND  database  field 
description,  is  provided  in  the  Woods  Hole  Oceanographic  Institution  Technical  Report'  WHOI- 
92-31,*  entitled  “SOUND  Database  of  Marine  Animal  Vocalizations  Structure  and  Operations,” 
by  W.  A.  Watkins,  K.  Fristrup,  M.  A.  Daher,  and  T.  Howald.'*^  Thus,  for  each  digitized  file, 
there  is  information  regarding,  for  example,  the  sample  rate,  the  cut  (record)  size,  the 
identification  of  the  vocalizing  mammal,  and  the  species  code  of  the  mammal.  Similar  indexing 
information  is  also  included  in  the  first  512  bytes  of  each  file.  An  example  of  a  record  file  from 
the  SOUND  database  is  provided  in  table  1.  The  record  chosen  is  72007001.kay,  which  consists 
of  clicks  and  whistles  from  striped  dolphins  (Stenella  coeruleoalba).  Also  present  in  the  record  is 
ambient  noise  from  the  ship  or  from  ice  (transient  X),  and  the  other  species  present  during  the 
recording  (sperm  whales  (Physeter  catodon)).  An  explanation  of  the  various  fields  in  the  record 
is  given  in  table  2.  Additional  information"*^  on  the  structure  of  the  database,  with  a  listing  of 
species,  common  mammal  names,  and  geographical  locations  of  the  recordings  are  also  provided 
in  the  report. 


*Two  additional  Woods  Hole  Oceanographic  Institution  Technical  Reports,  WHOI-92-04'*’  and  WHOI-94-13,'^*  were 
also  available  for  reference. 
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Table  1.  Example  of  a  SOUND  Database  Record 


Record  72007001.kay 

RN  72007001  ~ 

CU  1063  B7:00.000  18:00.692 

NC41B 

SR  166666 

CS  1.918 

PL  Pemtek  301/  K-H  L0.02  H50.0 
SC  OTF 

GS  Stenella  coeruleonalba  BD15C 
;  Transient  X 

GA  ANWBD15C  ANWX 
OD  4-Aug-1972  BD15CX 
;  AugBDlSC  1972BD15C 
;  AugX  1972X 

NT  BD15C  X  N37  42’  W73  01  ’.  Clicks  and  whistles.  Water  splashes. 

OS  Physeter  catodon  BA2A 

NA  50  +  BD15C 

GB  Delaware  BD15C  X 

GC  N37BD15C  W073BD15C 

;  N37X  W073X 

AUWAW 


Table  2.  Explanation  of  the  Various  Fields  in  the  SOUND  Database  Record  in  Table  1 


Field  Label 

Example 

RN  72007001 

Retrieval  Number  of  Record  (Year/Tape#/Cut#) 

Year:  1972/  Tape  #:  007/  Cut  #:  001 

CU  1063  B7:00.000  18:00.692 

Cue  or  time  (min: sec)  on  tape  at  signal  end, 

(B)  analyzer  buffer  size  (min:sec),  and 
decimal  time  (sec)  from  the  start  of  the  Kay  buffer 
to  the  cursor  at  the  beginning  of  the  sound. 

NC41B 

#  of  charmels  recorded  (first  digit). 

#  of  channels  multiplexed  (second  digit),  Channel  ID  letters. 

SR  166666 

Sample  rate  (Hz). 

CS  1.918 

Cut  size  (sec). 

PL  Pemtek  301/K-H  L0.02  H50.0 

Playback  recorder/filter  type  settings  in  kHz,  L=low 
(forHP),H=high(forLP). 

SC  OTF 

Signal  class  (S=Signature,  M=Mimic,  V=Variant, 

D=Deletion,  U=Uncharacteristic,  C=Calf),  quality  (1  to  5-best). 
0=Overlap  in  T  (time)  or  F  (frequency) 

GS  Stenella  coeruleoalba  BD15C 

Genus  and  species  of  vocalizing  animal,  scientific  names, 

;  Transient  X 

species  code. 

GA  ANWBD15C  ANWX 

Geographic  location  area  code  (first  3  letters),  species  code. 

OD  4-Aug-1972  BD15CX 

Observation  date  for  original,  species  code,  e.g.,  for  BD15C, 

;  AugBDlSC  1972BD15C 
;  AugX  1972X 

month=August,  year=1972. 

NT  BD15C  X  N37  42’  W73  01’. 

Species  code,  Comments,  Recording  detail. 

Clicks  and  whistles.  Water  splashes. 

os  Physeter  catodon  BA2A 

Other  species  present,  species  codes. 

NA50  +  BD15C 

Number  of  animals  vocalizing,  species  codes. 

GB  Delaware  BD15C  X 

Geographical  location  area  name,  species  code. 

GCN37BD15C  W073BD15C 

Geographical  location  latitude  (N  or  S  and  2  digits) 

;  N37X  W073X 

Geographical  location  longitude  (E  or  W  and  3  digits), 
species  codes. 

AU  WAW 

Author,  originator  of  recording. 
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In  order  to  easily  access  the  various  types  of  marine  mammal  sounds  from  the  SOUND 
database  for  analysis,  a  shorter,  simplified  database  was  created  for  easier  access  to  the  required 
information.  For  each  digitized  data  record  chosen,  this  database  contains  the  retrieval  number 
of  the  record  and  the  various  types  of  sounds  present,  together  with  the  common  names  and 
species  codes  of  the  marine  mammals  that  created  the  sounds.  For  example,  the  entry  in  the  new 
database  for  the  record  in  table  1  that  corresponds  to  striped  dolphin  sounds  is  simply: 

72007001. kay  -  Striped  dolphin  BD15C,  transient  X 
(other  species;  sperm  whale  BA2A) 

Clicks  and  whistles.  Water  splashes. 

Based  on  the  simplified  database,  the  main  marine  mammals  present  in  the  sound  cuts 
were  grouped  together  with  some  types  of  recorded  sounds,  as  outlined  in  table  3.  Some  of  the 
more  interesting  artifacts  in  the  data  include  a  12-kHz  pinger,  water  splashes  and  droplets, 
reverberation,  ship  engine  noise,  and  hydrophone  bumps.  It  is  also  possible  to  audibly  observe 
the  data  and  to  hear  the  time-frequency  structure  of  the  whistles  that  are  present.  In  order  to 
access  the  data  from  the  digitized  files,  a  MATLAB  function  (see  section  A.  1  in  the  appendix) 
was  written  to  extract  the  header  of  each  file  and  read  the  data. 
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Table  3.  Marine  Mammals  and  Some  Typical  Sounds  Present  in  the  Database 
Provided  by  the  Woods  Hole  Oceanographic  Institution 


Striped  Dolphin  (Stenella  coeruleoalba) 

Other  mammal  present:  Sperm  Whale  (Physeter  catodon) 

•  Clicks  and  whistles.  Water  splashes. 

•  Clicks  and  whistles. 

•  Clicks,  pulsed  clicks,  and  whistles. 

•  Clicks  and  whistles.  Water  droplet. 

•  Clicks  and  whistles.  About  4  sp.  whale  clicks  present  on  cut. 


White-Sided  Dolphin  (Lagenorhynchus  acutus) 
Other  mammal  present:  Finback  Whale 
_ (Balaenoptera  physalus) _ 

•  Clicks,  pulsed  click  burst  1 2-kHz  pinger  present. 

•  Clicks,  pulsed  click  burst. 

•  Clicks. 

•  Clicks  and  whistles. 

•  Clicks  and  whistles.  Reverberation. 

•  Clicks,  pulsed  clicks.  Reverberation. 

•  Clicks  and  a  whistle.  Maruffa  ship  noise-crack. 

•  Clicks  and  3  whistles.  Reverberation. 

•  Clicks.  Reverberation  present. 

•  Clicks,  pulsed  clicks,  and  a  whistle. 


Spotted  Dolphin  (Stenella  attenuata) 

•  Whistle.  Water  splash. 

•  Clicks. 

•  Whistle  and  clicks. 

•  Clicks  and  whistle.  Water  splash. 

•  Clicks,  pulsed  clicks,  and  whistle. 

•  Clicks,  pulsed  clicks,  and  whistle. 

•  Clicks;  pulsed  clicks. 


Table  3.  Marine  Mammals  and  Some  Typical  Sounds  Present  in  the  Database 
Provided  Provided  by  the  Woods  Hole  Oceanographic  Institution  (ConPd) 


Spotted  Dolphin  (Stenella  attenuata) 

Other  mammals  present:  Humpback  Whale  (Megaptera  novaeangliae) 
_ and  Sperm  Whale  (Physeter  catodon) _ 

•  Many  overlapping  clicks;  whistle. 

•  Ship’s  engine  noise  throughout. 

•  Many  overlapping  clicks,  ship. 

•  Many  overlapping  clicks;  pulsed  clicks;  whistle  engine. 


Long-Finned  Pilot  Whale  (Globicephala  melaena) 
_ Other  mammal  present;  Sperm  Whale  (Physeter  catodon) 

•  Fast  click  series. 

•  Fast  click  series.  Ship  noise  (clunk). 

•  Clicks  and  whistles.  Water  splash. 

•  Clicks  and  whistles. 

•  Clicks,  pulsed  clicks,  and  whistles. 

•  Clicks,  burst  pulsed  clicks,  and  whistles. 

•  Sperm  whale  clicks.  Pilot  whale  clicks  and  whistles.  Hydrophone  bumps. 

•  Pilot  whale  clicks  and  whistles.  Hydrophone  bumps. 

•  Sperm  whale  clicks.  Pilot  whale  clicks,  pulsed  clicks,  and  whistles.  Splashes. 

•  Sperm  whale  clicks  -  skipped  beats.  Pilot  whale  clicks  and  whistles. 

•  Pilot  whales  -  clicks,  fast  series  of  clicks,  and  whistles. 

•  Whistles. 

•  Chirps,  whistles.  Hydrophone  bumps. 

•  Whistles.  Hydrophone  bumps. 

•  Chirps. 

•  Whistles.  Water  splash. 

•  Whistles  and  clicks. 

•  Whistles,  pulsed  clicks,  chirps.  Hydrophone  bumps. 

•  Chirps;  pulsed  clicks. 
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2.2  CLICK  AND  WHISTLE  SOUNDS 


Time-frequency  techniques  were  used  in  this  report  to  analyze  clicks  and  whistles  that 
were  present  in  the  data  provided  by  WHOI.  These  were  the  main  sounds  produced  by  dolphins 
and  whales;  some  characteristics  of  the  sounds  are  given  below. 


•  Clicks:  Clicks  are  sonar  signals  that  are  used  by  marine  mammals  for  echoloc- 

ation.*"’  Using  its  active  and  passive  sonar  capabilities  in  the  form  of  click  sounds,  a 
marine  mammal  such  as  a  dolphin  or  a  whale  can  effectively  probe  its  underwater 
environment  for  the  purpose  of  navigation,  obstacle  and  predator  avoidance,  and 
prey  detection.  Clicks  consist  of  short-duration,  broadband,  transient-like  pulses  that 
are  used  by  marine  mammals  to  detect,  localize,  discriminate,  and  recognize  various 
objects  of  interest  such  as  prey,  obstacles,  and  predators.  Experiments  have  shown 
that  a  click  has  a  short  duration  of  40  -  600  p-s  and  a  3-dB  bandwidth  of  4  -  60  kHz, 
depending  on  the  species  of  the  marine  mammal.^  For  example,  the  Pacific 
bottlenose  dolphin  has  a  3-dB  bandwidth  of  30  -  60  kHz.  As  the  data  sampling  rate 
is  166.666  kHz  for  the  data  provided,  the  broadband  clicks  extend  over  the  full 
bandwidth  of  the  data.  Marine  mammals  emit  bursts  of  clicks,  called  click  trains, 
when  searching  for  targets.  The  number  of  clicks  and  the  time  interval  between 
clicks  depend  on  various  factors,  such  as  the  distance  of  interest,  the  difficulty  in 
detecting  a  target,  the  presence  or  absence  of  a  target  of  interest,  and  the  mammal’s 
expectation  of  finding  a  specific  target.  For  example,  it  has  been  recorded  that  for 
the  Atlantic  bottlenose  dolphin,  the  lag  time  between  clicks  ranged  from  2.5  ms  to  20 
ms  for  a  target  range  of  0.4  m  to  40  m.*  An  example  of  a  single  sonar  click  in  the 
time  domain  of  a  white-sided  dolphin  from  data  file  7500101 2. kay  is  shown  in  figure 
1(a).  The  click  occurs  at  approximately  1.2396  s  and  has  a  duration  of  about  60  ps. 
The  Fourier  transform  (frequency  domain)  of  the  data  segment  is  shown  in  figure 
1(b),  where  the  spectrum  has  large  magnitude  values  over  a  broad  range  of 
frequencies.  Indeed,  the  click  is  of  short  duration  and  is  relatively  broad  band. 
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Whistles:  Whistles  are  non-echolocation  signals  that  are  predominantly  emotional 
and  communication  sounds.^'*'*  They  are  long-duration,  narrowband,  frequency- 
modulated,  continuous  tonal  sounds  that  are  not  used  for  active  sonar  searches. 
Whistles  vary  in  frequency,  duration,  and  in  the  shape  of  the  whistle  time-frequency 
structure  depending  on  the  nature  of  the  signal’s  frequency  modulation.  For  example, 
the  frequency  range  of  the  dolphin  whistles  varies  from  5  kHz  to  15  kHz,  and  their 
duration  varies  from  0.5  s  to  2  s.  The  whistles  are  used  mainly  for  communication 
between  the  mammals  in  order  to  establish  vocal  contact.  There  are  disputes  as  to 
whether  each  signature  whistle  identifies  one  animal  from  another.  Studies  have 
shown  that  dolphins  in  isolation  have  a  characteristic  signature  whistle.  When 
dolphins  are  in  groups,  however,  they  make  various  types  of  whistles,  as  they  may 
have  the  ability  to  mimic  each  other’s  signature  whistle.^  The  frequency  modulation 
in  the  whistle  has  a  large  variation  between  the  different  mammals,  as  the  dominant 
frequency  as  a  function  of  time  changes  in  various  ways.  The  characteristic  whistle 
time-frequency  structure  may  show  a  rise  or  a  fall  in  frequency,  it  may  be  flat  (no 
change  in  frequency),  it  may  be  falling  hyperbolically  (as  1 /frequency),  or  it  may  be 
changing  sinusoidally  or  as  a  function  of  some  power.  In  many  cases,  the  time- 
frequency  structure  may  consist  of  a  combination  of  various  frequency  modulation 
changes.  As  a  result,  the  whistle’s  group  delay  time-frequency  structure  may  change 
from  mammal  to  mammal.  Various  classes  of  quadratic  time-frequency 
representations  are  examined  in  this  report  for  use  as  analysis  tools  for  clicks  and 
whistles. 
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3.  CONSTANT  TIME-SHIFT  COVARIANT  QTFRS 


3.1  COHEN’S  QTFR  CLASS 

A  QTFR  of  a  signal  x(t),  with  Fourier  transform  defined  as  X{f )  =  is 

a  two-dimensional  function  of  time  t  and  frequency  /  denoted  as  {t,  /) .  Cohen’s  class^*’’ 

contains  all  QTFRs,  denoted  as  /)  (the  superscript  (C)  in  the  QTFR  indicates  that  the 
QTFR  is  a  member  of  Cohen’s  class),  that  satisfy  the  time-shift  covariance  and  the  frequency- 
shift  covariance  properties. The  time-shift  covariance  property  is  defined  as 

(S^XMf)  =  =  rf '((-11,  /),  (1) 

where  is  an  operator  that  causes  the  signal  to  be  shifted  in  time  by  an  amount  q,  and  the 
double  arrow  implies  the  transformation  from  the  analysis  signal  to  the  QTFR.  Note  that 
Tg^^  (t,  f )  stands  for  with  Y{f)  =  v)  (/) .  Thus,  equation  ( 1 )  states  that  if  a  signal 

X{f)  is  time-shifted  by  an  amount  q,  then  a  Cohen’s  class  QTFR  {t,  /)  is  also  shifted 

along  the  time  axis  by  the  same  amount  T^x^it  -rj,  f).  The  frequency-shift  covariance  property 
is  defined  as 

(M,X)(fl  =  X(f-v>  ^  tS'JIS  =  T?(l,f-v),  (2) 

where  is  an  operator  that  causes  the  signal  to  be  shifted  in  the  frequency  domain.  Both  the 

time-shift  and  frequency-shift  covariance  properties  are  important  in  applications  where  the 
signal  needs  to  be  analyzed  at  all  time-frequency  points  with  fixed  time-frequency  resolution. 

As  a  result,  the  QTFRs  in  Cohen’s  class  exhibit  analysis  characteristics  that  do  not  change  with 
time  and  frequency  (constant  bandwidth  analysis),  and  preserve  the  signal’s  time  and  frequency 
shifts.  This  is  important  in  applications  such  as  speech  analysis,  narrowband  Doppler  systems, 
and  multipath  environments. 
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Based  on  these  two  covariance  properties,  any  member  of  Cohen’s  class  can  be  written  as 


J-OO  J-OD 


ro(.^Y/-/.  v)UJf,v)eJ^'^’^dfd>, 

J-00  J-oo 


(3) 

(4) 


where  v)  =  f  -  f  is  a  two-dimensional  function  (or  kernel)  that  uniquely 

\  2  2) 

characterizes  the  Cohen’s  class  QTFR  and  the  signal  product  is  defined  as 


^  vV..r. 


The  Wigner  distribution**’  is  an  important  member  of  Cohen’s  class,  as  it  satisfies  many 
desirable  QTFR  properties.  It  is  defined  as 


WD,(t,fii  r x{f+'^X'{f-Ae>^dv=ru,(f,v)e‘‘"dv. 

j-oo  2)  y  2)  •»-* 


(5) 


and  has  the  kernel  (/  v)  =5  (/)  in  equation  (4).  The  Wigner  distribution  exhibits  high 

time-frequency  localization  for  signals  such  as  sinusoids,  impulses,  and  linear  chirps.  However, 
due  to  its  quadratic  nature,  the  Wigner  distribution  suffers  from  interference  or  cross  terms  when 
multicomponent  signals  are  analyzed.  Cross  terms  may  impede  signal  analysis  depending  on  the 

N 

application.  In  particular,  the  Wigner  distribution  of  an  N  component  signal  X{/)  -  ^^/(/) 

/=! 

is  given  by 


N  p~\ 


WDydJ)  =  YWD,(I.J)  +  2'£Y^Re{WD^,.^,m. 

/=I  P-1  q-\ 


(6) 
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where  ,  {t,f)  is  the  cross-Wigner  distribution  of  Xp{f)  and  X^{f)  (defined  in 

equation  (5)),  with  X  and  X*  replaced  by  Xp  and  X* ,  respectively/^  Thus,  the  Wigner 

distribution  of  the  multicomponent  signal  X{f)  consists  of  the  sum  of  N  auto  terms  and  the  sum  of 
N(N-1) 

— ^  cross  terms. 


In  practical  applications,  the  cross  terms  in  equation  (6)  can  be  attenuated  by  smoothing 
the  Wigner  distribution.  Note  that  any  member  of  Cohen’s  class  can  be  written  as  a  smoothed 
Wigner  distribution  using 


=  r  r  y^f^(t-i,f-})WD,ft,})di4, 

^00 


(7) 


where  f)  =  (f,  v)e  .  Thus,  the  cross  terms  in  equation  (6)  can  be  reduced  by 

choosing  the  kernel  f)  in  equation  (7)  accordingly.  Specifically,  the  spectrogram^^’ 
SPECx(t,  f)  uses  an  analysis  window  Y(f)  to  reduce  cross  terms.  The  kernel  of  the  spectrogram 
in  equation  (7)  is  the  Wigner  distribution  of  the  smoothing  window  T(J),  i.e., 
vfpEc  (^’f )  -  '/)■  Inserting  this  kernel  in  equation  (7),  the  spectrogram  can  be  defined 

as  a  smoothed  Wigner  distribution 


SPECJt,fi=  [^[WDrft-t,f-f)WD,(i.})didf  = 


(8) 


Note  that  equation  (8)  shows  that  the  spectrogram  can  also  be  defined  as  the  squared  magnitude 
of  the  short  time-Fourier-transform,  linear  time-frequency  representation.^^ 
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As  the  smoothing  kernel  of  the  spectrogram  is  the  Wigner  distribution  of  a  window,  the 
spectrogram  cannot  simultaneously  provide  a  small  amount  of  smoothing  in  both  the  time  and 
frequency  directions.  On  the  other  hand,  the  smoothed  pseudo-Wigner  distribution  uses  two 
windows  to  provide  independent  time  and  frequency  smoothing.  The  smoothed  pseudo-Wigner 
distribution  SPWDx{t,f)  uses  a  separable  kernel  equation  (7),  where 

g{t)  and  H(/)  are  two  analysis  windows;  it  is  defined  as 

SPWDJtJ)  =  f  [  g(t-t)H(f-f)WD,(t,f)didf.  (9) 

J-oo  J-oo 


3.2  AFFINE  QTFR  CLASS 

The  affine  class'^’  contains  all  QTFRs  ( that  satisfy  the  time-shift 

covariance  property  in  equation  (1),  and  the  scale  covariance  property  defined  as 


(C„X)(f)  = 


A 

at,— 

a) 


(10) 


The  scale  covariance  property  is  important  for  multiscale  analysis,  scale  covariant 
systems,  the  wideband  Doppler  effect,  and  detecting  short-duration  transients.  Thus,  affine 
QTFRs  are  used  in  applications  such  as  image  enlargement  and  data  compression,  where  it  is 
desirable  to  preserve  dilations  or  scale  changes  on  the  analysis  signal.  Many  affine  QTFRs  are 
used  for  constant-Q  time-frequency  analysis,  where  the  analysis  bandwidth  is  proportional  to  the 
analysis  frequency.  This  provides  an  alternative  to  the  constant  bandwidth  analysis  achieved  by 
Cohen’s  class  QTFRs. 

Based  on  the  time-shift  covariance  property  and  the  scale  covariance  property,  any  affine 
class  QTFR  can  be  written  as 
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U^(f,v)e^^^'^dfdv, 


(12) 


where  [  - 6  + — ,  -b-—  is  a  two-dimensional  kernel  that  uniquely 

V  2  2  ) 

characterizes  the  affine  QTFR  and  the  signal  product  Ux  if,  v)  that  is  defined  in  equation 
(4).  The  Wigner  distribution  in  equation  (5)  is  also  a  member  of  the  affine  class.  Another 
important  member  of  the  affine  class  is  the  scalogram,  which  is  defined  as  the  squared  magnitude 
of  the  wavelet  transform.*^’ 
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4.  GENERALIZED  TIME-SHIFT  COVARIANT  QTFRs 


The  generalized  time-shift  covariant  QTFR  consist  of  QTFRs  that 

are  covariant  to  frequency-dependent  time  shifts  x(f)  that  can  be  selected  to  match  the  group 
delay  function  of  the  analysis  signal.  The  generalized  time-shift  covariance  property  is  important 
for  analyzing  signals  passing  through  systems  with  dispersive  characteristics  corresponding  to 
specific  group  delay  functions.  For  QTFR  T^-  the  generalized  time-shift  covariance 
property  is  defined  as 


=  TJt  -  CT(f).f), 


(13) 


where  the  generalized  time-shift  covariance  operator  is  given  by 


(Di^>X)(f)  = 


Here,  ^(6)  is  a  one-to-one  phase  function,  the  time  shift  or  group  delay  r  (/)  =  — ^ 

df 


'X 

\fr  J 


(14) 


is  the 


derivative  of  the  phase  function,  and  ^  >  0  is  a  fixed  reference  frequency.  Different  generalized 
time  shifts  are  obtained  by  fixing  ^(6)  and  x(/')  in  equations  (13)  and  (14).  Some  special  cases  of 
generalized  time  shifts  include: 


•  constant  (nondispersive)  time  shifts  in  Cohen’s  class  or  the  affine  class  when 

1 


^{b)  =  b  and  x{f)  = 


fr 


hyperbolic  time  shifts  in  the  hyperbolic  class  when  '^{b)  -lnb(b>0)  and  x(f)  =  y 


Note  that  equation  (14)  simplifies  to  in  equation  (1)  when  ^(b)  =  b  and  q 


fr 
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•  Kth  power  time  shifts  in  the  Kth  power  class,  k  0,  when  =  ^k(^)  = 
sgn(Z;)|6|'^, where  sgn(Z>)  is  -1  for  6  <  0,  1  for  b>0,  and  0  for  ^  =  0,  and 


^(/)  =  (/)  = 


K 

f 

fr 

fr 

1  - 

exponential  time  shifts  in  the  exponential  class  when  ^  (b)  =  e  and  r(/)  =  — , 

f r 


Thus,  generalized  time-shift  covariant  QTFRs  unify  existing  QTFR  classes,  such  as  Cohen’s 
class,  the  affine  class,  the  hyperbolic  class,  the  power  classes,  and  the  exponential  class.  They 
are  important  QTFR  classes  for  analyzing  signals  passing  through  dispersive  systems  and  are 
specifically  suited  for  signals  whose  group  delay  is  proportional  to  the  time  shift  x(/’). 


Generalized  time-shift  covariant  QTFR  classes  can  be  obtained  by  warping  existing 
constant  time-shift  covariant  classes  such  as  Cohen’s  class  and  the  affine  class.^®’^^’'*^’'*^’^®'^^ 
The  form  of  the  warping  is  fixed  by  the  choice  of  ^(6)  in  the  desirable  generalized  time-shift 
covariance  property  in  equations  (13)  and  (14).^®’  The  QTFR  warping  is  given  by 


rp(G  class) ^  _  nn{class^ 


t 

1^' 


(15) 


where  the  generalized  time-shift  warping  operator  is  defined  as 


(W,X)(fi  = 


1 


^X\ 


\frJJ 


(16) 


Here,  the  inverse  function  ^’’(6)  is  such  that  ^'*(^(6))  =  b  and  ^{b) 


.  The  superscript 


(class)  indicates  which  QTFR  class  undergoes  the  warping  in  equation  (15).  When  (class)  —  (C), 
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corresponding  to  Cohen’s  class,  then  Cohen’s  class  QTFRs  /)  are  warped  using  equation 

(15)  to  obtain  the  generalized  warped  Cohen’s  class  QTFRs  (/,/).  Similarly,  when 

(class)  =  (A),  corresponding  to  the  affine  class,  then  affine  class  QTFRs  f )  are  warped 

using  equation  (15)  to  obtain  the  generalized  warped  affine  class  QTFRs  /).  The 

generalized  warped  QTFRs  always  satisfy  the  generalized  time-shift  covariance  in  equation  (13) 

for  a  given  one-to-one  function  \{h)  because  the  warping  maps  the  constant  time-shift  operator 

S  ^  in  equation  (1)  of  Cohen’s  class  or  the  affine  class  to  the  generalized  time-shift  operator 
7r 

in  equation  (14)  using  where  is  such  that  =  X(f).  The 

7r 

generalized  warped  QTFRs  also  satisfy  an  additional  covariance  property  that  depends  on  h,(b).^^ 


4.1  GENERALIZED  WARPED  COHEN’S  CLASS 

4.1.1  Generalized  Warped  Cohen’s  Class  Formulation  and  Covariance  Properties 

The  generalized  warped  Cohen’s  class  (GC)  is  obtained  by  warping  Cohen’s  class 
QTFRs  ’  using  equation  (15)  with  (c/a55)  =  (Q.  Applying  the  warping  in  equation  (15)  to 
Cohen’s  class  QTFR  formulation  in  equations  (3)  and  (4),  any  GC  QTFR  can  be  expressed 

36,38,41,44,  50,51 

as 


=  f  f  T; 


(GC) 


\frj 


-k 


r  (  f\ 


A 

\frj 


\frj 


yfrj 


Vk/lX/2.>| 


'X(f,)ir(A)e  df,df„ 


(17) 


=  ff^r 


(GC) 


f  ff\ 


\ 


\frj 


J2n- 


bjj  V^(b,p)e  ^(fydbdp. 


(18) 
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where  b-  —  ,  b  +  —  |  is  a  two-dimensional  kernel  function  that  uniquely 

V  2  2) 

characterizes  the  GC  QTFR  The  signal  function  V^{b,  p)m.  equation  (18)  is  given  by 


v,(h.(l)  =  f,Uw.x(f,l>,fJ>  =  fr(n'iX){f.b 


■J 


V 


■J 


-fr 


f 

r' 

b  +  — 

\ 

f 

( 

h- 

-fi] 

~2 

X 

( 

fri~' 

* 

X 

f 

1 

V  2/ 

/ 

K 

\ 

V  2yJ 

\ 

\  ^  )  J 

,  (19) 


where  the  signal  product  U ^{f,v)  is  given  in  equation  (4)  and  the  warped  signal  W^X[/)  is 

defined  in  equation  (16).  Note  that  the  GC  formulation  in  equations  (17)  and  (18)  generalizes 
Cohen’s  class  since,  when  ^(Z>)  =  b,  equations  (17)  and  (18)  simplify  to  Cohen’s  class 
formulation  in  equations  (3)  and  (4)  with  > /r^2 ) 

^[^^\b,p)  {f,b,f,p),  respectively. 


Due  to  the  warping,  the  constant  time-shift  covariance  property  in  equation  (1)  and  the 
frequency-shift  covariance  property  in  equation  (2)  in  Cohen’s  class  are  transformed  into  two 
new  covariance  properties  in  the  GC.  As  a  result,  any  GC  QTFR  {t,  f )  satisfies  the 
generalized  time-shift  covariance  property  in  equations  (13)  and  (14),  and  the  generalized 
warped  frequency-shift  covariance  property  defined  as 


\'^(fry(fy)) 

Mf) 


>fry(f'^) 


(20) 


where  y{f,v)  = 


C  (  f\ 

V  Uy 


The  covariance  property  in  equation  (20)  follows  from  the 


mapping  of  the  frequency  shift  operator  (My  in  equation  (2))  in  Cohen’s  class  to 
which  transforms  the  signal  as 
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(y?x)(f)  = 


x(fMfy))- 


(21) 


The  covariance  property  in  equations  (20)  -  (21)  may  or  may  not  be  useful  in  a  particular 
application,  depending  on  the  choice  of  ^(Z)).  For  ^(6)  =  In  6  and  v  =  frlna,  the  covariance 
property  in  equation  (20)  simplifies  to  the  scale  covariance  property  in  equation  (10),  which  is  an 
important  property  for  multiresolution  analysis.  Furthermore,  the  operator  in  equation  (21) 
simplifies  to  the  scale  operator  in  equation  (10),  since  (y/"L^)(/)  =  (Ca^)(/) 
corresponding  GC  is  the  hyperbolic  class^'^’  (see  table  4). 


4.1.2  Generalized  Warped  Cohen's  Class  Members 

GC  QTFRs  can  be  obtained  by  warping  Cohen’s  class  QTFRs  using  equation  (15),  and 
are  defined  by  fixing  the  kernel  y?)  in  equation  (18).  An  important  GC  member  that 

satisfies  many  desirable  QTFR  properties  is  the  generalized  warped  Wigner  distribution 
(GWWD)  WD^p{t,  f ) .  The  GWWD  is  obtained  by  warping  the  well-known  Wigner  distribution 
in  equation  (5)  using  equation  (15).  Thus,  the  GWWD  is  defined  as 


WDf(t,f)  = 


{  j 

\ 

KfrJj 

yfr  ) 

) 

j2K~ — ^ 

e  ^0)  dp . 


(22) 


and  its  kernel  in  equation  (18)  is<I>^|;|  {b,  P)  =  5  (b) .  Note  that  equation  (22)  simplifies  to  the 

Wigner  distribution  in  equation  (5)  when  ^(b)  =  b  and  x(f)  =  — .  The  GWWD  provides  good 

f r 

time-frequency  resolution  for  various  analysis  signals^^  and  is  ideally  matched  to  signals  whose 
group  delay  equals  the  generalized  time  shift  that  is  preserved  by  the  GWWD.^^’  The 
generalized  impulse  is  a  nonstationary  signal  defined  in  the  frequency  domain  as 
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(23) 


^  ^  f  -f^ 

with  phase  function  i{b)  and  group  delay  function  ct(/)  =  c — ^ 


When  the  one-to-one 


df  y/rj 

function  ^{b)  of  the  GWWD  equals  the  phase  function  of  the  generalized  impulse,  the 
GWWD  of  the  generalized  impulse  is  simply  a  Dirac  delta  function  centered  along  the  impulse’s 
group  delay,  i.e.. 


WD%(t,f)  =  \:(fj\&(t-cz(f))  if,  and  only  if,  %(b)=%(b). 

^  C 


(24) 


Thus,  the  GWWD  is  ideally  matched  to  generalized  impulses  as  it  provides  high  time-frequency 
localization  along  the  signal’s  group  delay  characteristics.  Just  like  the  Wigner  distribution,  the 
GWWD  suffers  from  cross  terms  when  multicomponent  signals  are  analyzed.  For  example,  if 
the  signal  consists  of  the  sum  of  two  generalized  impulses  X(/)  =  +  /J*(/),  then  the 

GWWD  results  in  the  sum  of  two  auto  terms  and  one  cross  term.  The  auto  terms  {t,  f ) 
and  WD%  {t,  f )  are  defined  in  equation  (24).  The  cross  term 


(t,f)  =  2co5 


2n(c^ 


v/r  JJ 


Ci+C2 


,  ^(f) 

^  J 


(25) 


is  centered  along  the  group  delay  curve  t  =  t(/ )  and  oscillates  at  a  frequency  that 


increases  as  the  parameter  difference  {c\  -  ci)  increases. 
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In  many  practical  applications,  it  is  important  to  reduce  cross  terms  by  means  of 
smoothing  the  GWWD.  Specifically,  any  member  of  the  GC  ean  be  written  as  a  smoothed 

version  of  the  GWWD  (t,/)  in  equation  (22))  using  a  smoothing  kernel  ,  i.e.. 


=  f  r  V 


(GC) 

T 


^(f)  ^(f) 


\frj 


\J  r  J  J 


wD<}>(i.f)didf, 


(equation  (18)).  A  GC QTFR  that  applies 

J-co 

smoothing  to  reduce  cross  terms  in  the  time-frequency  plane  is  the  generalized  warped 
spectrogram  that  is  obtained  by  warping  the  spectrogram  in  equation  (8)  using  equation  (15). 
This  generalized  warped  spectrogram  uses  a  window  to  reduce  cross  terms.  More  smoothing  is 
provided  by  the  generalized  warped  smoothed  pseudo-Wigner  distribution  that  is  obtained  by 
warping  the  smoothed  pseudo-Wigner  distribution  in  equation  (9)  using  equation  (15).  This  is 
because  the  generalized  warped  smoothed  pseudo-Wigner  distribution  uses  two  windows  that 
can  smooth  out  cross  terms  independently  along  the  time  direction  and  along  the  group  delay 
t(/)  direction. 


4.1.3  Generalized  Warped  Cohen’s  Class  Examples 

Different  GC  QTFR  classes  can  be  obtained  simply  by  choosing  a  one-to-one, 
differentiable,  and  invertible  function  ^(fe)  in  equations  (17)  or  (18).  Depending  on  ^(h),  the 
covariance  properties  in  equations  (13)  and  (20)  may  be  useful  depending  on  the  analysis 
application  at  hand.  Examples  of  the  generalized  warped  Cohen’s  class  are  the  hyperbolic  class 
and  the  Kth-power  warped  Cohen’s  class;  these  are  listed  below  and,  together  with  some 
additional  examples,  are  summarized  in  table  4. 
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1 .  Hyperbolic  class:  When  ^(h)  =  \nb  and  x(f)  =  y  ,  /  >  0,  the  corresponding  GC  in 

equation  (18)  is  the  hyperbolic  class.^^’ Thus,  any  member  of  the  hyperbolic  class 
can  be  written  as 


=  f  r 


inl- 

.  fr 


b.p 


Vf(b,j3)e^^’^dbd/3 , 


where  (t>!j.^\b,fi)  =  (equation  (18))  is  a  two-dimensional  kernel  that  uniquely 

charaeterizes  the  hyperbolie  QTFR  and  is  the  signal  product  V^{b,^)  in  equation 

(18)  when  ^(h)  =  In  b.  Any  member  of  the  hyperbolic  class  satisfies  two  eovariance  properties 
that  are  special  cases  of  the  two  covariance  properties  satisfied  by  all  GC  QTFRs.  Specifically, 
the  generalized  time-shift  covarianee  of  the  GC  in  equation  (13)  simplifies  to  the  hyperbolic 

time-shift  covarianee  of  the  hyperbolic  class  defined  as  Y/)  ~ 


^  c  ^ 

^ - .  / 

/ 


,  since 


1^-1  o  _  r){i'^) _  zj 

^  c  ~  (note  that  IFb  is  the  warping  operator  in  equation  (16)  when 

fr 


^  (h)  =  \nb,  and  is  defined  in  equation  (14)  with  ^  (b)  =  lnZ>).  The  generalized  warped 
frequency-shift  covariance  in  equations  (20)  and  (21)  simplifies  to  the  scale  covariance  in 

V 

equation  (10),  since  IF,;'  M^W^„  =  y'”  =  CJr ;  thus,  hyperbolic  class  QTFRs  preserve  hyperbolic 

time  shifts  and  scale  changes  on  the  analysis  signal.  An  important  member  of  the  hyperbolic 
class  is  the  Altes  Q-distribution^’’^^  which  corresponds  to  the  generalized  warped  Wigner 
distribution  in  equation  (22)  when  ^(6)  =  In  b.  The  Altes  Q-distribution  is  defined  as 


AD,(t,f)  =  )X  (fe  , 


(26) 
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and  is  ideal  for  analyzing  signals  with  hyperbolic  group  delay  characteristics  such  as  Doppler- 
invariant  signals  that  are  similar  to  the  signals  used  by  bats  for  echolocation.^^  For  example,  the 
Altes  Q-distribution  of  a  hyperbolic  impulse,  defined  in  the  frequency  domain  as 

j 


X(f)  = 


I  -jlncln 


47 


fr 


for /  >  0,  is  a  Dirac  delta  function  centered  at  the  signal’s  group  delay 


function,  i.e.,  AD^  (/,/)  =  y  <5 


t - 

V  fj 


(23)  and  (24)  when  '^{b)  =  In  and  x(f)  = 


.  />o.  Note  that  this  is  a  special  case  of  equations 
1 


/ 


2.  Kth-power  warped  Cohen’s  class:  When  iib)  =  i^{b)  =  sgn{b)\b^ ,  k  and 


Af)  =  = 


K 

K 

/ 

fr 

fr 

K-l 


,  the  GC  in  equation  (18)  is  the  Kth  power  warped  Cohen’s 


class.^^’"*'’^”  The  generalized  time-shift  covariance  of  the  GC  in  equation  (13)  simplifies  to  the 
Kth  power  time-shift  covariance  defined  as  f)  =  -  cxjf),f),  since 


(where  Df'^’and  are  defined  in  equations  (14)  and  (16),  respectively, 

"  A  ' 


with  IJ^b)  =  i^Jb)),  but  equation  (20)  does  not  simplify  to  any  known  covariance.  Members  of 
this  class  include  the  Kth  power  Wigner  distribution  and  the  power  spectrogram."*^’ 
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Table  4.  Examples  of  QTFR  Classes  from  the  Generalized  Warped  Cohen’s  Class  with 
Associated  Function  ^(6),  Generalized  Time  Shift  xif),  and  Covariance  Operators 
Corresponding  to  Equations  (13)  —  (14)  and  (20)  -  (21) 


Characteristic 

Class  Functions 

Examples  of  Generalized  Warped  Cohen’s  Class 

Cohen’s 

Class 

Hyperbolic 

Class 

Kth  Power  Warped 
Cohen’s  Class 

Exponential 
Warped 
Cohen’s  Class 

Function  ^{b) 

b 

\nb 

Generalized  Time  Shift  t(/) 

1 

Jr 

1 

/ 

^k(/) 

L 

Tr 

Covariance  Operator  ^  in 

Equation  (14) 

Sc 

fr 

II 

Covariance  Property  in  Equation 
(13) 

Constant 

Time-Shift 

Hyperbolic 
Time- Shift 

Power 

Time-Shift 

Exponential 

Time-Shift 

Covariance  Operator  ^  in 

Equation  (21) 

Mv 

v(exp) 

Covariance  Property  in  Equation 
(20) 

Frequency- 

Shift 

Scale 

Power  Warped 
Frequency-Shift 

Exponential 

Warped 

Frequency-Shift 

4.2  GENERALIZED  WARPED  AFFINE  CLASS 

4.2.1  Generalized  Warped  Affine  Class  Formulation  and  Covariance  Properties 

The  generalized  warped  affine  class  (GA)  is  obtained  by  warping  the  time-shift  and  scale 
covariant  affine  QTFR  class^®’  using  equation  (15)  with  (class)  =  (A).  Similar  to  the 

generalized  warped  Cohen’s  class,  the  GA  is  a  generalized  time-shift  covariant  class,  since  both 
the  GC  and  the  GA  are  obtained  by  warping  constant  time-shift  covariant  classes  using  equation 
(15).  However,  in  comparison  to  GC  QFTRs,  the  GA  satisfies  an  additional  covariance  property 
that  is  based  on  the  scale  covariance  property  of  the  affine  class;  thus,  it  yields  new  QTFRs  that 
are  useful  in  different  type  of  applications.  Applying  the  warping  in  equation  (15)  to  the  affine 
class  QTFR  formulation  in  equations  (11)  and  (12),  any  GA  QTFR  can  be  written  as 
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Tr«.j)  =  r^,llrr> 


f,  '' 


'  i(^)  iC-f) 

J  r  J  r 

\  Jr  Jr  J 


•XtfJX  (f,)e 


A  III 


df,df„ 


J_  r  r 

f  uu  T 

\i(y) 

J  r 


-b 


i(-j) 

Jr  Jr  J 


VJb,p)e^"(f)^dbdp, 


(27) 


(28) 


where  Vxib,^)  is  defined  in  equation  (19)  and  +  -b  -  ^)  is  a  two- 

dimensional  kernel  function  that  uniquely  characterizes  the  GA  QTFR  The  kernels  in 
equations  (27)  and  (28)  are  identical  in  form  to  the  corresponding  affine  class  kernels  in 
equations  (11)  and  (12),  i.e.,  r!^°^^(b,,b,)  =  rf(b,.b,)  and  d>!^^^(b,p)  =  ^f(b,p)?^  Thus,  if 
the  kernel  of  an  affine  QTFR  is  known,  then  the  corresponding  generalized  warped  version  of 
that  affine  QTFR  uses  the  same  kernel  function.  The  GA  provides  a  generalization  of  the  affine 
class  since,  when  l^{b)  =  A ,  the  G^l  QTFR  formulation  in  equations  (27)  and  (28)  simplifies  to  the 
affine  class  QTFR  formulation  in  equations  (11)  and  (12),  respectively. 


Due  to  the  warping,  the  constant  time-shift  covariance  property  in  equation  (1)  of  the 
affine  class  QTFRs  is  transformed  to  the  generalized  time-shift  covariance  property  in  equations 
(13)  and  (14)  of  the  GA  QTFRs.  Similarly,  the  scale  covariance  property  in  equation  (10)  of  the 
affine  class  QTFRs  is  transformed  to  the  generalized  warped  scale  covariance  property  defined 
as 


at 


7(4  ^(/«)) 


K/) 


(29) 
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where  l( f,a)  =  ^  ^(—^(^))-  The  covariance  property  in  equation  (29)  follows  since  the 

J  r 


scale  operator  in  the  affine  class  maps  to  W.  ^CJV.  =  ,  which  transforms  the  signal  as 


(L^!>X)(f) 


X(fMf,a)). 


(30) 


The  covariance  property  in  equation  (29)  is  not  easy  to  understand,  but  simplifies  to 
known  covariance  properties  for  the  following  functions:  when  =  ^^(b)  =  sgn(b)\b\'^ , 

equation  (29)  results  in  the  scale  covariance  property  in  equation  (10)  since 

=  C^^,^(a).  When  ^{b)  =  \ub,  equation  (29)  results  in  the  power  warping 

covariance  property^'*’  defined  as 


fr^x(f)  .V 


fr 


since  =  1^'"^  =  (where  ^^b)  =  sgn(b)\b\'\  Furthermore,  when  ^(b)  =  e‘’ , 

equation  (29)  results  in  the  frequency-shift  covariance  property  in  equation  (2)  since 
W  =  =  Mr,  (see  table  5). 

exp  exp  a  f^Xna  ^ j. 
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Table  5.  Examples  of  QTFR  Classes  from  the  Generalized  Warped  Affine  Class  with 
Associated  Function  ^(b),  Generalized  Time  Shift  x0),  and  Covariance  Operators 
Corresponding  to  Equations  (13)  -  (14)  and  (29)  -  (30) 


Characteristic 

Class  Functions 

Examples  of  Generalized  Warped  Affine  Class 

AffineClass 

Hyperbolic- 
ally  Warped 
Affine  Class 

Kth  Power 
Class 

Exponential 

Class 

Function  <f(6) 

b 

Xnb 

Generalized  Time  Shift  t(/ ) 

1 

fr 

1 

/ 

Xk(/) 

1  f 

j/ 

Covariance  Operator  ^  in 

Equation  (14) 

fr 

n(^K) 

s,=D<r' 

Covariance  Property  in  Equation 

jm _ 

Constant 

Time-Shift 

Hyperbolic 

Time-Shift 

Power 

Time-shift 

Exponential 

Time-shift 

Covariance  Operator  in 

Equation  (30) 

Ca 

Wr 

c 

Covariance  Property  in  Equation 
(29) 

Scale 

Power 

Warping 

Scale 

Frequency-Shift 

4.2.2  Generalized  Warped  Affine  Class  Members 

GA  QTFRs  are  specified  by  their  kernel  P)  in  equation  (28)  and  are  obtained 

simply  by  warping  known  affine  class  QTFRs.  One  important  member  of  the  GA  is  the 
generalized  warped  Wigner  distribution  discussed  in  section  4.1.2  that  is  obtained  by  warping  the 
Wigner  distribution  as  in  equation  (22).  Since  the  Wigner  distribution  is  a  member  of  the 
intersection  between  Cohen's  class  and  the  affine  class,  the  generalized  warped  Wigner 
distribution  in  equation  (22)  is  a  member  of  the  intersection  between  the  GC  and  the  GA. 

Another  important  member  of  the  GA  that  may  be  used  to  remove  cross  terms  in  multicomponent 
signal  analysis  applications  is  the  generalized  warped  version  of  the  scalogram,  which  is  the 
generalized  warped  version  of  the  squared  magnitude  of  the  wavelet  transform. 


33 


4.2.3  Generalized  Warped  Affine  Class  Examples 


The  generalized  warped  affine  power  class  and  exponential  class  examples  listed  below 
are  obtained  by  appropriately  choosing  the  one-to-one,  differentiable,  and  invertible  function 
in  equations  (27)  or  (28).  Note  that  the  classes  are  also  summarized  in  table  5  with  some 
additional  examples. 

1.  Power  class:  When  ^(b)  =  =  sgn(b)\b\^  dind  T(f)  =  Tjf)  =  ^ 

Jr 

the  corresponding  GA  class  in  equation  (28)  is  the  Kth  power  class  k  ^  Any  member  of  the 

Kth  power  class  can  be  written  as 


1 


(jjf, 


f  r  (D 

J-oo  J-co 


(P) 

T 


-b 


iJj)  iJj) 


y(PK) /u  p 


fb,P)e 


dbdp  , 


where  (b!ffb,P)  =  (equation  (28))  is  a  two-dimensional  kernel  that  uniquely 

characterizes  the  power  QTFR  and  V^^'fb,P)  is  the  signal  product  Vx  (Z>,P)  inequation 
(28)  when  ^(b)  =  ijb).  Power  class  QTFRs  satisfy  two  simplified  covariance  properties  that 
correspond  to  the  generalized  time-shift  covariance  in  equation  (13)  and  the  generalized  warped 
scale  covariance  in  equation  (29)  when  ^(6)  (Z>).  Thus,  power  class  QTFRs  satisfy  the 

power  time-shift  covariance  property  defined  as  (t>f )  =  f  ),f )  ,  since 

Note  that  is  defined  in  equation  (14)  with  Ifb)  —  ^^(b),  and  is 

"  fr  ' 

defined  in  equation  (16)  with  ^(b)  -  ^Jb).  Power  QTFRs  also  satisfy  the  scale  covariance 
property  in  equation  (10)  since  =  Q,  rar  Thus,  power  QTFRs  are  useful  for 

K 

analyzing  signals  passing  through  systems  with  power  time- frequency  characteristics.  An 

or  orf 

important  member  of  the  Kth  power  class  is  the  Kth  power  Wigner  distribution,  ’  which  is  the 


34 


generalized  warped  Wigner  distribution  in  equation  (22)  when  i/b)  =  ^Jb).  The  Kth  power 
Wigner  distribution  is  defined  as 


WD</'(l.f)  = 


[A 


2. 


1- 

V  2y; 


e  ^ 


(31) 


ll- 


2k 


and  it  reduces  to  the  Wigner  distribution  when  k  =  1 .  Another  important  member  of  the  Kth 
power  class  is  the  powergram,  which  is  the  squared  magnitude  of  a  power  wavelet  transform.^^ 


I  ^ 

2.  Exponential  class:  When  ^;^(b)  =  e*  and  r(f)  =  — e^',  the  corresponding  GA  class 

f r 

in  equation  (28)  is  the  exponential  class.'*^’'*^’^®  The  generalized  time-shift  covariance  property 
in  equation  (13)  simplifies  to  the  exponential  time-shift  covariance  property  defined  as 

/ 

T'iy'AtJ)  =  TA^At-jre^',  f),  since  Here,  and  are 

/r 


defined  in  equations  (14)  and  (16),  respectively,  with  ^(b)  =  A’ .  Moreover,  the  generalized 
warped  scale  covariance  property  in  equation  (29)  simplifies  to  the  frequency-shift  covariance  in 
equation  (2)  since  ^ /Am  •  Some  important  members  of  the  exponential 

class  include  the  exponential  Wigner  distribution  and  the  exponogram."*^ 
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5.  IMPLEMENTATION  OF  GENERALIZED  TIME-SHIFT  CO  VARIANT  QTFRs, 

INCLUDING  POWER  CLASS  QTFRs 


The  GC  and  GA  QTFRs  can  be  implemented  directly  using  their  formulation  in  equations 
(18)  and  (28),  respectively.  However,  the  direct  implementation  technique  is  computationally 
intensive,  as  it  involves  a  two-dimensional  integration.  A  less  complicated  technique  is  to 
implement  the  warping  transformation  in  equation  (15)  because  algorithms  already  exist  to 
compute  Cohen’s  class  QTFRs  and  affine  class  QTFRs,  and  the  only  new  algorithms  needed  are 
the  ones  to  warp  the  analysis  signal  as  in  equation  (16),  and  to  transform  the  time-frequency  axes 
as  in  equation  (15).  For  example,  the  Altes  Q-distribution  in  section  4.1.3  can  be  computed 
directly  using  equation  (26)  or  indirectly  as  the  hyperbolically  warped  version  of  the  Wigner 
distribution  in  equation  (5),  i.e.. 


ADJt,J)  =  WD, 


,  frln^  , where =  \e^'X\  f^e 


Note  that  equation  (32)  corresponds  to  the  warping  in  equations  (15)  and  (16)  with  ^(Z>)  =  \xvb. 
Since  simple  and  efficient  algorithms  exist  to  compute  the  Wigner  distribution  in  equation  (5),  to 
compute  the  Altes  Q-distribution,  the  signal  must  first  be  warped  with  the  exponential  mapping 
X{f)  (H’in2f)(/),  then  the  Wigner  distribution  of  the  warped  signal  must  be  computed,  and 
finally  the  time  and  frequency  axes  must  be  transformed  for  the  correct  time- frequency 


localization  using  (t,f)^ 


Similarly,  any  member  of  the  hyperbolic  class 


can  be  implemented  in  the  same  way  simply  by  replacing  the  Wigner  distribution  in  equation 
(32)  with  the  corresponding  member  of  Cohen’s  class.^'*  Documented  discrete  implementation 
realization  of  the  hyperbolic  class  using  the  warping  approach,  the  power  class  QTFRs,^^’  and 
the  exponential  class  QTFRs'^*’'*^  by  appropriately  warping  the  affine  class  QTFRs  can  be  found 
in  Canfield^^  and  Praveenkumar.®°  An  in-depth  description  of  the  three  steps  of  the  discrete 
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implementation  of  the  power  class  QTFRs  is  provided  in  Hlawatsch.  The  discrete 
implementation  of  the  exponential  class  QTFRs  was  also  realized,  and  the  necessary 
computational  steps  were  documented  and  can  be  found  in  Papandreou-Suppappola."*^ 

As  presented  in  section  4.2.3,  any  Kth  power  QTFR  t,f )  can  be  obtained  by 

warping  a  corresponding  affine  class  QTFR  )  using  equations  (15)  and  (16)  with 

((b)  =  (Jb)  =  sgn('(>J|*r.  i.e.. 


=  r. 


(A) 


W^X 


— i —  ft(—) 


(33) 


where  the  power  warping  operator  is  defined  as 


(W^X)(f)  = 


l-K 

2k 


X 


f  ^ 

friXjr) 
y:  Jr  J 


(34) 


The  discrete  implementation  of  power  QTFRs  is  based  on  the  warping  relations  in  equations  (33) 
and  (34),  which  allow  the  use  of  existing  efficient  algorithms  for  computing  affine  class 
QTFRs.^^  This  implementation  technique  is  conceptually  similar  to  the  implementation 
technique  used  for  hyperbolic  class  QTFRs^^  in  Canfield^^  and  Praveenkumar.^*^  The  first  step 
consists  of  a  power-law  frequency  warping  of  the  signal  X(f)  according  to  equation  (34),  the  next 

step  is  a  computation  of  the  affine  QTFR  of  the  warped  signal  and  lastly  a  nonlinear 

time-frequency  coordinate  transform  is  performed  according  to  equation  (33),  i.e.. 


(tj)^ 


frUf) 


J  J 
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For  the  discrete  implementation  of  the  first  step,  the  discrete  version  of  the  signal’s 
Fourier  transform  X[l](l  =  0,1,...,L-1^  with  jfrequency  sample  spacing  A/is  first 
interpolated  (upsampled)  by  factor  u,  yielding  the  Fourier  transform  samples 
X' [I' ]  (r  =  0, 1, ...  ,uL  - 1 ).  Next,  discrete  warped-frequency  locations  are  computed  according 
to  equation  (34),  such  that  uniform  sampling  of  the  warped-frequency  axis  is  achieved;  these 


discrete  warped-frequency  locations  are  given  by 

TC 


(m  =  0,1,. ...M-U 


where  Au  is  the  frequency  sample  spacing  of  the  warped  Fourier  transform,  computed  such  that 
|/m+i  ~  fm\-  4/^.  V/w,  and  M  is  the  number  of  warped  Fourier  transform  samples  required  to 

represent  the  entire  frequency  domain.  The  Fourier  transform  value  at  each  warped-frequency 
location  is  obtained  by  linear  interpolation  of  the  closest  neighbors  in  the  upsampled  Fourier 
transform  X'[/'].  This  gives  a  discrete  warped- frequency  Fourier  transform 
Y[m]  (m  =  0, 1, . . . ,  M  - 1).  In  order  to  avoid  time-aliasing  effects  in  the  subsequent 
computation  of  the  affine  QTFR,  the  warped-frequency  Fourier  transform  Y[rri\  is  finally 
interpolated  (upsampled)  to  bandlimit  the  Fourier  transform  to  one  quarter  of  the  sampling  rate, 
yielding  Y' [m' ]  (m'  =  0, 1, . . .  ,M'-1 ) ,  where  M  =  2M 


In  the  second  step,  a  discrete-time,  discrete-frequency  version  of  the  affine  QTFR 

is  computed  for  the  warped-frequency  signal  The  last  step  computes  the 


warped  time-frequency  locations  f^)  =  - . -  ,  f.  ,  where 

Jr^K\  J i) 


{ i/Sf  1  f  Au  ^ 

/  =  fr^K  /  "7"  such  that  uniform  two-dimensional  sampling  in  the 

<  Jr  )  \  Jr  j 


warped  time-frequency  domain  is  achieved,  and  subsequently  calculates  the  corresponding 
sample  values  using  linear  two-dimensional  interpolation.  Note  that  the  Kth  power- 

warped  Cohen’s  class  in  section  4.1.3  can  be  implemented  using  the  same  three  steps  above. 
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except  that  the  affine  class  QTFR  in  the  second  step  must  be  replaced  with  a  Cohen’s  class 
QTFR.  The  MATLAB  program  for  the  implementation  of  the  power  Wigner  distribution  is 
given  in  section  A. 2  in  the  appendix.  The  power  Wigner  distribution  in  equation  (31)  is  obtained 
by  warping  the  affine  Wigner  distribution  using  equations  (33)  and  (34).  ’ 
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6.  TIME-FREQUENCY  ANALYSIS  OF  SONAR  CLICKS 


As  clicks  are  short-duration  pulses  (impulse  like)  in  time  and  thus  broadband  in 
frequency,  they  are  expected  to  have  constant  group  delay  characteristics.  As  a  result,  clicks 
may  be  analyzed  by  QTFRs  that  preserve  constant  group  delay  characteristics  in  the  time- 
frequency  plane.  Thus,  clicks  may  be  well  matched  to  Cohen’s  class  of  time-frequency  shift 
covariant  QTFRs,  such  as  the  Wigner  distribution  and  the  spectrogram  presented  in  section  3.1. 


As  the  sound  records  have  long  sample  durations,  the  MATLAB  function  in  section  A.  1 
in  the  appendix  allows  small  sections  of  the  data  to  be  read,  starting  at  a  sample  offset.  The 
MATLAB  function  in  section  A.3  in  the  appendix  plots  the  data  section  in  time  and  frequency, 
and  also  computes  its  Wigner  distribution  and  spectrogram.  This  function  is  used  to  analyze  a 
single  click  as  the  click  duration  is  small,  extending  only  over  a  small  section  of  data  samples. 


The  Wigner  distribution  of  the  impulse  X(  f ) 


-jTsic-t 

e  ' 


is  a  Dirac  delta  function  centered 


c  1  c 

at  the  signal’s  constant  group  delay  function  t  =  — ,  i.e.,  WD^(t,f )  =  t - ) .  Note  that 

fr  Jr  fr 


this  is  a  special  case  of  the  GWWD  of  a  generalized  impulse  when  ^(b)  =  b  and  r(f)  =  —  in 

f r 

equations  (23)  and  (24).  Thus,  as  the  click  shows  similar  time-frequency  characteristics  as  an 
impulse,  it  is  expected  that  the  Wigner  distribution  of  a  single  click  will  have  a  very  short  time 
duration,  equivalent  to  the  duration  of  a  click,  that  extends  over  many  frequencies  due  to  its 
broadband  nature.  For  example,  the  time  domain  signal  is  plotted  in  figure  2(a);  for  comparison, 
the  Wigner  distribution  of  a  single  click  (shown  in  figure  1)  of  a  white-sided  dolphin  from  file 
75001012.kay  is  shown  in  figure  2(b).  Indeed,  the  Wigner  distribution  shows  that  the  click  has  a 
short  duration  and  is  relatively  broadband.  Thus,  the  Wigner  distribution  provides  good  time- 
frequency  resolution  for  single  clieks.  Note  that  although  the  data  were  filtered  to  remove  low 
frequencies  due  to  recording  equipment  or  engine  noise,  the  data  is  still  noisy,  and  the  Wigner 
distribution  shows  some  interference  terms.  This  interference  is  further  removed  using  the 
spectrogram  in  equation  (8)  as  shown  in  figure  2(c),  but  since  the  spectrogram  uses  a  window  to 
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smooth  out  the  interference,  it  is  not  as  localized  as  the  Wigner  distribution.  Clicks  can  also  be 
analyzed  by  other  Cohen’s  class  QTFRs,  such  as  the  pseudo-Wigner  distribution  and  the 
smoothed  pseudo-Wigner  distribution  using  the  MATLAB  function  given  in  section  A.4  in  the 
appendix.  For  comparison,  this  function  also  computes  some  hyperbolic  QTFRs  (section  4.1.3) 
such  as  the  Altes  Q-distribution  and  the  pseudo- Altes  Q-distribution,  and  some  power  QTFRs 
(section  4.2.3),  such  as  the  power  Wigner  distribution  and  the  power  pseudo-Wigner  distribution. 
The  hyperbolic  QTFRs  and  the  power  QTFRs  are  not  matched  to  signals  with  a  constant  group 
delay  function,  ’  thus,  they  are  not  ideal  for  the  analysis  of  clicks. 


The  smoothed  versions  of  the  Wigner  distribution,  such  as  the  spectrogram  or  the  pseudo- 
Wigner  distribution,  are  important  when  a  click  train  is  analyzed  (versus  a  single  click),  as  they 
reduce  the  oscillatory  cross  terms  that  result  in  the  Wigner  distribution  (equation  (6))  that  may 
impede  analysis.  Figure  3(a)  shows  a  series  of  clicks  (click  trains)  in  time  of  a  long-finned  pilot 
whale  from  data  file  75010001  .kay.  The  Wigner  distribution  in  figure  3(b)  suffers  from  cross 
terms  between  any  two-click  terms  and  from  the  noise  present  in  the  data.  For  the  sum  of  two 


impulses,  X(  f )  = 


-jiKcrj 

(e 


-jlnciy 

+  e  J ,  the  Wigner  distribution  results  in  the  sum  of  two 


\  c  c 

auto  terms  -—(6(t - -)  +  5(t - -) )  and  in  one  cross  term  given  by 

Jr  fr  fr 


fr 


f  c+c 

cos  2n(c,  -cjjr  d  t- 

V  Jrjy 


.  Thus,  the  cross  term  oscillates  in  the  frequency 


direction  and  is  centered  at  t  ^  (equation  (25)  with  ^(Z>)  =  6)).  Note  that  there  are  no 

'^fr 

cross  terms  oscillating  in  the  time  direction.  The  pseudo-Wigner  distribution  in  figure  3(c)  uses 
a  short-duration  Hanning  window  to  smooth  out  cross  terms  in  the  frequency  direction  only.  The 
pseudo-Wigner  distribution  is  defined  in  equation  (9)  with  g(f)  =  5(/)  and  which  is  a 

short-duration  window  for  stronger  smoothing  in  the  frequency  direction.’*  As  the  click  train 
may  be  considered  as  the  sum  of  the  impulses,  the  Wigner  distribution  cross  terms  are  centered 
at  a  constant  group  delay  curve,  and  are  found  along  the  frequency  direction.  Thus,  the  pseudo- 
Wigner  distribution  in  figure  3(c)  has  successfully  reduced  the  cross  terms  with  only  some  loss 
of  time-frequency  resolution,  and  it  clearly  shows  the  various  clicks.  There  are  about  five  click 
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trains  present  with  about  two  to  five  clicks  per  click  train;  the  click  trains  are  about  0.5  ms  apart, 
and  the  clicks  in  each  click  train  are  about  35  ps  apart.  On  the  other  hand,  the  spectrogram  does 
not  allow  separable  time  and  frequency  smoothing,  and  would,  as  a  result,  have  more  loss  of 
time  and  frequency  resolution. 

In  studying  the  nature  of  the  click  bursts  present  in  the  data  files,  table  6  compares  the 
duration  of  the  click  burst,  the  start  time  of  the  click  burst,  the  number  of  clicks  per  burst,  and  the 
instantaneous  click  rate  (number  of  clicks  per  second  in  intervals  of  one-tenth  of  the  duration  of 
the  click  burst)  for  various  click  bursts.  The  click  bursts  usually  lasted  from  0.04  -  0.60  s,  and 
the  number  of  clicks  per  burst  ranged  from  37  -  892  clicks.  The  instantaneous  click  rate  showed 
that  in  some  cases,  the  number  of  clicks  per  second  increased  and  then  decreased,  whereas  in 
other  cases,  the  number  of  clicks  per  second  kept  increasing  or  decreasing.  Without  any 
additional  information  on  the  mammars  behavior  at  the  time  of  recording,  it  is  difficult  to 
determine  the  behavior  of  the  instantaneous  click  rate.  Another  difficulty  in  obtaining  the 
instantaneous  click  rate  was  that,  if  the  duration  of  the  click  burst  exceeded  0.6  s  and  was 
combined  with  closely  spaced  clicks,  it  was  not  easy  to  determine  how  many  clicks  were  present. 
As  analysis  tools,  the  time  signal  was  used  as  well  as  the  spectrogram,  but  the  spectrogram 
provided  a  faster  implementation  for  longer  sections  of  click  bursts.  Note,  however,  that  as  the 
duration  of  the  click  burst  increased,  it  was  difficult  to  differentiate  the  closely  spaced, 
broadband  clicks. 
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Frequency  (Hz)  Frequency  (Hz)  Amplitude 


4 


(a) 


x10 


Time  (s) 


1.2392  1.2394  1.2396  1.2398  1.24  1.2402  1.2404 

Time  (s) 


Figure  2.  The  Single  Sonar  Click  of  the  White-Sided  Dolphin  from  Data  File  75001012.kay 
from  Figure  1:  (a)  Signal  in  the  Time  Domain,  (b)  Wigner  Distribution,  and 

(c)  Spectrogram  of  the  Click 
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Frequency  (Hz)  Frequency  (Hz)  Amplitude 


0,1815  0.182  0.1825  0.183  0,1835  0.184 

Time  (s) 


0.1815  0.182  0.1825  0.183  0.1835  0.184 

Time  (s) 


0.1815  0,182  0.1825  0.183  0.1835  0.184 

Time  (s) 


Figure  3.  Click  Trains  of  a  Long-Finned  Pilot  Whale  from  the  Data  File  7501000I.kay: 
(a)  Signal  in  Time,  (b)  Wigner  Distribution,  (c)  Pseudo-Wigner  Distribution 
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Table  6.  Comparison  of  Various  Statistics  of  Marine  Mammal  Click  Bursts 


Table  6.  Comparison  of  Various  Statistics  of  Marine  Mammal  Click  Bursts  (Cont’d) 


Mammal 


File  Number  Duration  Start  Time  Clicks/Burst  Instantaneous  Click  Rate 


tClicks/s  “>Time(s) 


7.  TIME-FREQUENCY  ANALYSIS  OF  WHISTLES 


Whistles  are  continuous,  frequency-modulated,  narrowband,  nonstationary  sounds  whose 
time-frequency  characteristics  vary  in  duration,  bandwidth,  and  shape  depending  on  the  marine 
mammal;  i.e.,  their  frequency  content  varies  considerably  and  in  different  ways  during  the  time 
duration  of  the  sound.  For  example,  it  is  reported’^  that  the  Atlantic  pilot  whale  has  seven 
categories  of  whistle  time-frequency  structures,  as  follows: 

1 .  Level  frequency  (essentially  no  change  in  frequency  throughout  the  entire  duration  of 
the  whistle). 

2.  Falling  frequency  (decrease  in  frequency  throughout  the  duration  of  the  whistle). 

3.  Rising  frequency  (increase  in  frequency  throughout  the  duration  of  the  whistle). 

4.  Up-down  frequency  (frequency  rise  followed  by  frequency  fall). 

5.  Down-up  frequency  (frequency  fall  followed  by  frequency  rise). 

6.  Multiple  humps  frequency  (frequency  has  at  least  two  inflections). 

7.  Waver  frequency  (frequency  has  at  least  two  inflections  symmetrical  about  some 
mean). 

Thus,  for  different  mammals,  the  characteristic  whistle  structure  may  show  a  rise  or  fall  in 
frequency  or  no  change  in  frequency,  or  the  structure  may  fall  hyperbolically  (as  1 /frequency),  or 
either  change  sinusoidally  or  as  a  function  of  some  power.  In  many  cases,  the  time-frequency 
structure  may  consist  of  a  combination  of  various  frequency  modulation  changes.  Thus,  the 
whistle’s  group  delay  characteristic  changes  from  mammal  to  mammal.  Due  to  the  large 
variations  in  whistle  structures,  Cohen’s  class  QTFRs  may  not  yield  the  true  time-frequency 
structure  of  all  possible  whistles,  as  these  QTFRs  are  well  matched  only  to  constant  or  linear 
changes  in  group  delay.  Some  whistles  may  be  better  matched  to  the  generalized  time-shift 
covariant  QTFRs  of  section  4  that  take  into  account  the  whistle’s  characteristic  group  delay 
structure  (for  example,  a  hyperbolic  QTFR  may  be  well  matched  to  a  whistle  with  a  1/frequency 
modulation  characteristic).  In  the  next  section,  various  QTFRs  are  used  to  analyze  the  different 
types  of  whistles. 
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7.1  SPECTROGRAM  ANALYSIS  OF  WHISTLES 


The  spectrogram  of  various  whistles  was  computed  to  demonstrate  the  whistle  variations 
in  time-frequency  structure  using  the  built-in  MATLAB  function  specgram.m.  The  spectrogram 
was  used  as  it  is  computationally  efficient  for  the  large  data  records.  The  spotted-dolphin 
whistles  in  sound  file  83 006023. kay  are  shown  in  figure  4  to  have  a  power  and  linear  time- 
frequency  characteristic  structure,  whereas  the  whistles  of  the  spotted  dolphins  in  sound  file 
9002709d.kay  have  sinusoidal  time-frequency  characteristics,  as  shown  in  figure  5.  The  long- 
finned  pilot  whale  whistles  often  have  hyperbolic  time-frequency  characteristics,  as  shown  by 
the  spectrogram  of  data  file  7703500w.kay  in  figure  6.  The  spectrogram  of  the  white-sided 
dolphin  whistles  from  data  file  750010 16.kay,  shown  in  figure  7,  shows  a  linear/power  fall  in 
frequency.  The  data  files  may  also  contain  several  different  types  of  whistles,  as  is  the  case  with 
the  striped  dolphins  from  data  file  7200700f  kay  in  figure  8;  the  frequency  seems  to  be  rising  and 
falling  in  various  ways  during  the  sound  duration.  Most  of  the  figures  also  contain  broadband 
clicks  and,  although  the  figures  show  some  characteristic  time-frequency  whistle  structures  for 
different  marine  mammals,  other  structures  are  also  possible  by  the  same  mammal.  For  analysis 
purposes,  the  data  have  been  preprocessed  to  remove  low  frequencies  due  to  recording 
equipment  or  engine  noise  and  have  been  decimated  to  reduce  the  number  of  samples. 

In  order  to  quantify  the  data,  table  7  was  compiled  with  various  statistics  of  the  marine 
mammaTs  whistles  for  comparison  purposes.  The  table  includes  the  type  of  marine  mammal 
producing  the  whistle,  the  file  from  which  the  data  was  obtained,  and  the  duration,  start  time 
(from  the  beginning  of  the  file),  bandwidth,  lowest  frequency,  and  initial  frequency  of  the 
whistle,  as  well  as  a  plot  of  the  whistle’s  idealized  time-frequency  structure.  The  spectrogram 
was  used  as  an  analysis  tool  because  of  its  speed  and  because  it  provides  sufficient  means  to 
obtain  these  types  of  measurements.  The  results  of  the  analysis  closely  agree  with  previously 
published  results,*’’  and  the  following  characterizations  can  be  derived  from  the  data: 
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Figure  4.  Spectrogram  of  Spotted-Dolphin  Whistles  from  Data  File  83OO6023.kay 
Demonstrating  Power  and  Linear  (Rise  and  Fall)  Time-Frequency  Characteristics 


Time  (s) 


Figure  5.  Spectrogram  of  Spotted-Dolphin  Whistles  from  Data  File  9002709d.kay 
Demonstrating  Sinusoidal  (Rise,  Fall,  and  Rise)  Time-Frequency  Characteristics 
(Sound  file  also  contains  humpback  and  sperm  whales.) 


Frequency  (Hz) 


Figure  6.  Spectrogram  of  Long-Finned  Pilot  Whale  Whistles  from  Data  File  7703500w.kay 
Demonstrating  Hyperbolic  Time-Frequency  Characteristics  (Sound file  also  contains  sperm 

whales.) 
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Frequency  (Hz) 
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Figure  8.  Spectrogram  of  Striped  Dolphin  Whistles  from  Data  File  7200700f.kay 
Demonstrating  Various  Types  of  Time-Frequency  Characteristics 
(Sound file  also  contains  sperm  whales.) 
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Table  7.  Comparison  of  Various  Statistics  of  Marine  Mammal  Whistles  (Cont’d) 
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Table  7.  Comparison  of  Various  Statistics  of  Marine  Mammal  Whistles  (ConVd) 
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Table  7.  Comparison  of  Various  Statistics  of  Marine  Mammal  Whistles  (Cont’d) 


Table  7.  Comparison  of  Various  Statistics  of  Marine  Mammal  Whistles  (Cont’d) 
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Table  7.  Comparison  of  Various  Statistics  of  Marine  Mammal  Whistles  ( Cont’d) 
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Table  7.  Comparison  of  Various  Statistics  of  Marine  Mammal  Whistles  (Cont’d) 
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•  Long-fmned  pilot  whales  are  characterized  by  low-frequency-range  whistles.  The 
mean  initial  frequency  is  5.7  kHz,  the  mean  minimum  frequency  is  4.1  kHz,  and  the 
mean  bandwidth  is  3.1  kHz.  Rhythmically  repeated  individual  whistles  were  often 
noted.  Many  of  the  whistles  have  hyperbolic  time-frequency  characteristics; 
however,  other  time- frequency  structures  were  also  observed. 

•  White-sided  dolphins  are  characterized  by  high-frequency-range  whistles.  The  mean 
initial  frequency  is  15  kHz,  the  mean  minimum  frequency  is  9.1  kHz,  and  the  mean 
bandwidth  is  6.9  kHz.  The  whistles  often  seem  to  repeat  rhythmically.  Similar  to  the 
whistle  of  the  long-fmned  pilot  whale,  this  whistle  does  not  have  the  stereotypical 
sinusoidal  frequency  modulation.  The  whistles  appear  to  fall  in  frequency  in  a 
somewhat  linear  fashion. 

•  Spotted-dolphin  whistles  are  characterized  by  high-frequency-range  whistles.  The 
mean  initial  frequency  is  8.7  kHz,  the  mean  minimum  frequency  is  7.8  kHz,  and  the 
mean  bandwidth  is  10.4  kHz.  The  whistles  appear  to  have  power  time-frequency 
characteristics. 

•  Striped  dolphins  are  also  characterized  by  high-frequency-range  whistles.  The  mean 
initial  frequency  is  9.9  kHz,  the  mean  minimum  frequency  is  7.7  kHz,  and  the  mean 
bandwidth  is  6.2  kHz.  The  whistles  appear  to  have  the  stereotypical  sinusoidal 
frequency  modulation  and  have  both  rising  and  falling  time-frequency  components. 

Thus,  from  the  data  analyzed,  the  long-fmned  pilot  whale  whistles  start  at  a  lower  frequency  and 
have  narrower  bandwidths  than  the  white-sided,  spotted,  and  striped  dolphins.  Although  the 
time-frequency  structure  of  the  whistle  varied  greatly,  it  is  noted  that  the  observed  long-finned 
pilot  whale  whistles  have  a  hyperbolic  or  exponential  structure;  however,  other  studies  show  that 
the  long-fmned  pilot  whales  do  not  have  a  characteristically  exclusive  whistle  type.*'* 
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The  last  column  in  table  7  provides  a  convenient  and  intuitive  tool  for  interpreting  the 
time-frequency  characteristics  of  the  whistles.  It  also  provides  some  preliminary  information  on 
the  group  delay  structure  of  the  whistles  so  that  the  generalized  time  shift  of  the  generalized 
warped  QTFR  can  be  chosen  to  match  the  signal’s  group  delay  for  better  analysis  results.  The 
time-frequency  curves  in  the  last  column  of  table  7  were  obtained  by  selecting  time-frequency 
points  from  the  spectrogram  analysis  of  the  whistles,  and  then  by  using  cubic  spline  interpolation 
to  fit  the  selected  points. 

Additional  deductions  were  also  made  from  table  7  and  are  summarized  in  table  8  for 
long-finned  pilot  whales,  in  table  9  for  white-sided  dolphins,  in  table  10  for  spotted  dolphins,  and 
in  table  1 1  for  striped  dolphins.  The  tables  show  the  mean  value  and  the  standard  deviation  of 
the  initial  frequency,  the  minimum  frequency,  the  bandwidth,  and  the  duration  of  the  whistles  of 
each  marine  mammal.  Below  each  table,  histograms  of  the  various  statistics  are  also  given  to 
show  the  distribution  of  each  of  the  whistle  characteristies.  Based  on  the  computed  mean  and 
standard  deviation,  the  corresponding  Gaussian  distribution  was  also  computed  and 
superimposed  on  the  histogram.  Although  the  distribution  of  the  whistle  characteristic  is  not 
clearly  Gaussian,  the  superposition  helps  to  show  the  mean  value  of  the  whistle  characteristic. 

For  example,  the  initial  frequency  of  the  white-sided  dolphin  in  table  9  has  a  mean  value  of 
15.04  kHz  and  a  standard  deviation  of  3.44  kHz.  The  corresponding  histogram  for  the  initial 
frequency  in  table  9  shows  that  the  distribution  of  the  initial  frequencies  is  close  to  a  Gaussian 
distribution  with  the  given  mean  and  standard  deviation. 
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Table  8.  Whistle  Statistics  for  Long-Finned  Pilot  Whales 
(Statistical  sample  size  is  80  data  sets.) 
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Table  9.  Whistle  Statistics  for  White-Sided  Dolphins 
(Statistical  sample  size  is  61  data  sets.) 
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Table  10.  Whistle  Statistics  for  Spotted  Dolphins 
(Statistical  sample  size  is  29  data  sets.) 
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Table  11.  Whistle  Statistics  for  Striped  Dolphins 
(Statistical  sample  size  is  62  data  sets.) 


seconds 


For  some  of  the  data  files,  the  spectrogram  analysis  of  whistles  yielded  various  higher 
hannonics  as  well  as  the  first  harmonic  (belonging  to  the  actual  whistle).  This  phenomenon 
usually  occurred  in  files  consisting  of  spotted  dolphin  whistles,  together  with  clicks  and  pulsed 
clicks.  In  particular,  the  whistle  harmonics  appear  to  occur  in  the  region  of  the  pulsed  clicks. 
For  example,  figure  9  shows  the  spectrogram  of  data  file  8300603c.kay;  although  the  data 
information  that  came  with  the  file  states  that  one  whistle  is  present,  other  harmonics  are  clearly 
present.  It  is  important  to  investigate  whether  the  harmonics  effect  was  due  to  the  analysis  tool 
or  whether  it  was  either  a  biological  effect  or  a  recording  artifact.  Watkins  states  that, 
depending  on  the  spectrogram  window,  the  interpretation  of  spectrogram  analysis  composed  of 
pulsed  trains  (such  as  pulsed  clicks)  could  be  confusing.  This  is  because  as  the  pulse  rate 
increases  (that  is,  as  the  number  of  clicks  per  second  increases),  harmonics  result  at  multiple 
increments  of  the  pulse  rate  if  the  window  used  is  not  shorter  than  the  time  duration  between 
clicks.  The  harmonics  at  the  various  frequencies  now  appear  more  like  whistles,  since  their 
frequencies  change  as  the  pulse  rate  changes.  In  our  case,  however,  the  spectrogram  window 
length  and  the  pulse  rate  were  such  that  a  harmonic  effect  should  not  have  been  observed  in  the 
data.  In  addition,  figure  9  shows  that  the  pulsed  clicks  are  still  present.  This  indicates  that  the 
harmonics  are  actually  part  of  the  whistles,  not  the  clicks.  Further  analysis  using  the  time  signal 
and  the  Wigner  distribution  (that  have  no  windowing  effect)  show  that  the  harmonics  are  still 
present;  the  higher  harmonics,  therefore,  may  be  part  of  the  sounds  that  the  mammals  produce, 
and  the  higher  harmonic  (higher  frequencies)  indicates  a  decrease  in  harmonic  energy.  Another 
possibility  is  that  the  harmonics  are  the  result  of  the  recording.  In  either  case,  it  is  important  to 
know  that  the  harmonics  are  present  at  the  higher  frequencies,  even  if  their  energy  decreases. 
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Figure  9.  Spectrogram  of  a  Spotted-Dolphin  Whistle  from  Data  file  8300603c.kay 
(Note  the  four  whistle  harmonics  between  1.5  s  and  2. 7  s.) 
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7.2  OTHER  COHEN’S  CLASS  QTFRs  ANALYSIS  OF  WHISTLES 


7.2.1  Wigner  Distribution 

The  whistles  extend  over  a  large  number  of  samples  in  the  data,  as  shown  by  the  fact  that 
the  data  sampling  rate  is  166.66  kHz,  and  the  mean  values  of  the  duration  of  the  various  whistles, 
as  shown  in  tables  8  through  1 1,  range  from  0.59  s  to  1.16  s.  Thus,  most  whistles  extend  over  a 
sample  duration  of  98,000  to  193,000  data  samples.  As  a  result,  the  QTFR  implementation 
algorithms  must  be  able  to  handle  large  data  files.  The  spectrogram  is  computed  efficiently 
using  a  built-in  MATLAB  function;  however,  the  current  implementation  of  the  Wigner 
distribution  using  MATLAB  can  only  be  used  to  analyze  small  blocks  of  data  (block  lengths  of 
1024  samples).  Since  the  sampling  rate  of  the  data  (166.666  kHz)  is  at  least  four  times  the 
bandwidth  of  the  analog  data,  and  not  much  information  is  present  above  20  kHz  other  than  the 
broadband  clicks,  decimation  was  used  to  decrease  the  number  of  data  samples  as  it  allows 
Wigner  distributions  of  larger  sections  of  data  in  time  to  be  computed.  An  algorithm  was  also 
devised  to  compute  the  Wigner  distribution  of  small  records,  and  then  to  piece  the  resulting 
Wigner  distributions  into  one  large  time-frequency  representation.  This  increases  the  resulting 
block  section  that  can  be  analyzed  to  12,800  samples.  Note,  however,  that  the  resulting 
representation  is  not  equivalent  to  the  true  Wigner  distribution  of  the  entire  data  section,  as  the 
representation  does  not  include  the  cross  terms  between  any  two  data  blocks.  The  MATLAB 
function  for  this  representation  is  given  in  section  A.  5  in  the  appendix.  This  function  also 
demonstrates  the  use  of  decimation  to  reduce  the  data  samples. 


7.2.2  Smoothed  Pseudo-Wigner  Distribution 

Most  of  the  MATLAB  algorithms  presently  available  to  compute  various  QTFRs,  as 
shown  in  section  A.4  in  the  appendix,  can  only  process  short  signals  in  an  off-line  fashion. 
However,  the  mammal  sounds  from  the  SOUND  database  are  long  signals  with  a  sampling  rate 
of  166,666  samples  per  second.  For  example,  a  dolphin  whistle  of  only  2  s  is  considered  a  long 
signal,  as  it  consists  of  333,332  samples.  Thus,  to  analyze  long  segments  of  data,  new  methods 
of  computing  QTFRs  are  needed  for  real-time,  on-line  operations. 
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7. 2.2.1  Direct  Method  and  Short-Time  Technique.  Two  methods  for  computing  the  smoothed 
pseudo-Wigner  distribution  of  long  data  sets  are  discussed  below. 

1.  Direct  Method.  The  smoothed  pseudo-Wigner  distribution  is  a  smoothed  version  of 
the  Wigner  distribution  that  can  be  used  to  remove  cross  terms  when  multicomponent  or 
complicated  monocomponent  signals  are  analyzed.  As  defined  in  equation  (9),  the  SPWD  uses  a 
time-smoothing  window  g{t)  and  a  frequency-smoothing  window  H(f)  that  independently  control 
the  smoothing  in  the  time  and  frequency  directions.  The  current  MATLAB  program  computes 
the  SPWD  indirectly  by  first  multiplying  the  signal’s  narrowband  ambiguity  function  AFj^(t,  v) 
with  the  function  G(v)/?(t)  then  the  two-dimensional  Fourier  transform  of  the  product  is  taken. 
That  is, 


SFfVD^  {t,f)  =  £  ^G{v)h{r)AF^{T,v)e’'^^‘'-^^  dxdv , 


(35) 


where  the  ambiguity  function  is  the  two-dimensional  Fourier  transform  of  the  Wigner 
distribution  in  equation  (5),  yfF^.(T,v)  =  T  T  WD^{t,f)e~^^’'^^~^^dtdf;  G(v)  is  the  Fourier 

J-ao  J-oo 


transform  of  g(x)  and  h(x)  and  is  the  inverse  Fourier  transform  of  H  (/)  in  equation  (9).  This 
indirect  method  requires  the  whole  data  segment  (off-line)  in  order  to  compute  the  ambiguity 
function,  and  can  only  process  short  signals  as  the  result  is  saved  as  an  N  x  N  matrix,  where  N  is 
the  number  of  signal  samples.  In  order  to  enable  an  on-line  computation  of  the  SPWD,  a 
program  was  written  that  computes  the  SPWD  using  a  more  direct  method.  In  particular,  the 


local  autocorrelation  function  x 


t  +  - 


V 


( 

/  — 
2^ 


of  the  signal  jc(/)  is  first  computed  and  then 


convolved  with  the  time-smoothing  window  g(T)  of  length  Ng.  Since  the  length  of  the  window  is 
considerably  smaller  than  the  length  of  the  data,  i.e.,  «  N,  only  a  small  section  of  the  local 


autocorrelation  function  needs  to  be  computed.  In  addition,  the  local  autocorrelation  function 
can  be  updated  so  that  the  previous  value  can  be  used  to  compute  the  next  value.  The  result  of 
the  convolution  then  gets  multiplied  with  the  squared  magnitude  of  the  frequency-smoothing 
window  h(t)  of  even  length  Nh  and  the  Fourier  transform  of  the  product  is  obtained. 
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Thus,  although  the  SPWD  is  a  smoothed  version  of  the  Wigner  distribution  as  shown  in  equation 
(9),  it  can  also  be  computed  as 


SPWDJtJ)  ^  f 

J-oc 


J2, 


f- 


t  + ' 


2y 


g(t-i)di 


dx. 


(36) 


where  hif)  is  the  inverse  Fourier  transform  of  H{f)  in  equation  (9).  At  the  end  of  the 
computation,  the  SPWD  is  stored  in  an  W/,  x  W  matrix.  As  a  result,  since  not  all  the  data  is  used 
at  once  to  produce  each  time-slice  of  the  SPWD,  the  computation  is  performed  on-line  and  is  not 
computationally  intensive.  In  addition,  it  allows  for  larger  sections  of  the  data  to  be  analyzed. 
The  MATLAB  function  for  this  algorithm  is  given  in  section  A. 6  in  the  appendix.  Figures  10 
and  1 1  show  two  examples  of  using  the  SPWD  in  equation  (36)  to  analyze  large  sections  of  the 
mammal  sounds.  Figure  10  shows  the  SPWD  of  a  long-finned  pilot  whale  whistle  for  a  data 
segment  of  14,815  samples  (corresponding  to  0.8  s,  where  the  data  was  decimated  by  a  factor  of 
9  to  reduce  the  total  number  of  samples  without  aliasing).  The  hyperbolic  time-frequency 
signature  of  the  whistle  is  clearly  shown  by  the  SPWD,  without  suffering  from  any  inner  cross 
terms.  Here,  a  Blackman  time-smoothing  window  of  length  A/g=  114  and  a  Hamming  frequency¬ 
smoothing  window  of  length  TV*  =  200  were  used.  The  resulting  SPWD  was  a  100  x  14,815 
matrix,  since  only  the  positive  frequencies  were  obtained  as  the  analysis  signal  is  real.  Similarly, 
figure  1 1  shows  the  SPWD  of  a  spotted-dolphin  whistle  for  a  data  segment  of  37,500  samples 
(corresponding  to  0.45  s,  where  the  data  was  decimated  by  a  factor  of  2). 

2.  Short-Time  Technique.  Any  member  of  Cohen’s  class  in  section  3.1  can  be  obtained 
as  in  equation  (7).  Another  way  of  obtaining  Cohen’s  class  QTFRs  is  by  first 

computing  the  ambiguity  function  AF^( t,v)  of  the  data  and  multiplying  the  ambiguity  function 
with  a  two-dimensional  kernel  )  that  completely  characterizes  the  chosen  QTFR;  then 

the  two-dimensional  Fourier  transform  of  the  product  is  taken;  i.e. 
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Figure  10.  Direct  SPWD  of  a  Long-Finned  Pilot  Whale  Whistle  from  Data  File 
7703500w.kay  (Image  shows  the  whistle’s  hyperbolic  modulation  in  frequency.) 
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Figure  11.  Direct  SPWD  of  a  Spotted-Dolphin  Whistle  from  Data  File  9002709d.kay 
(Image  shows  the  whistle’s  power  modulation  in  frequency.) 
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(37) 
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where  the  kernel  x,  v^  is  the  two-dimensional  Fourier  transform  of  the  kernel  in 

'J^r^Y )  (equation  (7)).  Note  that  this  method  requires  all  the  data  to  be  present  in  an  off-line 
fashion.  Also,  due  to  the  computational  complexity,  only  short  signals  can  be  processed.  If  the 
signal  has  length  N  samples,  then  the  resulting  QTFR  has  dimensions  Nx  N.  However,  if  N  is 
large,  the  method  is  computationally  intensive  due  to  memory  constraints.  The  short-time 
technique,  on  the  other  hand,  yields  a  real-time,  on-line  operation,  as  it  does  not  require  the 
whole  signal  to  be  present  in  memory.  In  particular,  the  technique  sections  the  data  by 
windowing  it  using  a  rectangular  window  centered  at  time  t.  The  short-time  ambiguity  function 
is  then  computed  by  computing  the  ambiguity  function  of  each  windowed  section.  The  two- 
dimensional  Fourier  transform  of  the  product  of  the  short-time  ambiguity  function  with  the 
kernel  is  obtained,  but  only  the  time-slice  at  time  t  is  kept.  Thus,  there  is  no  need  to  compute  the 
other  time  slices,  resulting  in  memory  conservation  and  a  reduction  in  computational  time.  If  the 
signal  has  length  N  samples  (for  N  large)  and  a  window  of  length  Nw  is  used,  then  the  resulting 
QTFR  has  dimensions  N^/2  x  N/N^.  The  dimensions  are  much  smaller  than  Nx  N,  which 

would  result  from  an  off-line  operation;  thus,  memory  is  saved,  allowing  for  a  more  efficient 
computational  method.  Note  that  the  sections  may  be  overlapped  for  better  resolution  in  the 
time-frequency  plane. 

Although  the  short-time  technique  can  be  used  with  any  Cohen’s  class  QTFR,  it  is 
demonstrated  here  for  the  smoothed  pseudo-Wigner  distribution  (note  that  the  kernel  of  the 
SPWD  in  equation  (37)  is  'i^spfVD( =  G(v)h(T),as  shown  in  equation  (35)).  The  short-time 

SPWD  method  resulted  in  the  successful  analysis  of  many  long-duration  records  of  mammal 
sounds.  For  example,  in  figure  12  the  sounds  of  spotted  dolphins,  including  both  clicks  and  a 
whistle,  are  analyzed  using  the  short-time  SPWD.  The  whole  data  segment  ofN=  436, 160 
samples  (corresponding  to  2.6  s)  was  successfully  analyzed  and  the  resulting  SPWD  had 
dimensions  128  x  1,703,  since  a  window  of  length  N^  =  256  was  used.  The  short-time  SPWD 
clearly  shows  the  various  broadband  clicks  as  well  as  the  narrowband  whistle  (with  a 
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Figure  12.  Short-Time  SPWD  of  a  Spotted  Dolphin  Whistle  from  Data  File  83035024.kay 
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bandwidth  of  12  kHz  and  a  duration  of  1.1  s).  The  MATLAB  function  used  to  compute  the 
short-time  SPWD  is  given  in  section  A.7  in  the  appendix.  Note  that  the  direct  computation  of 
the  SPWD  as  discussed  above  requires  more  memory;  therefore,  it  cannot  handle  data  sets  as 
large  as  the  short-time  SPWD. 


7.3  HYPERBOLIC  CLASS  QTFRs  ANALYSIS  OF  WHISTLES 


Many  whistles  have  a  hyperbolic  time-frequency  characteristic  structure  (table  7).  This 
hyperbolic  structure  often  occurs  at  lower  frequencies  with  long-finned  pilot  whale  whistles, 
sinee  the  whistles  of  these  marine  mammals  are  emitted  at  lower  frequencies  than  the  other 
marine  mammals  in  the  data.  These  hyperbolic  whistles  start  at  high  frequencies  and  diminish  in 
frequency  in  a  1 //’fashion  during  the  duration  of  the  whistle.  Thus,  the  hyperbolic  class  QTFRs 
presented  in  section  4.1.3  would  be  well  matched  to  analyze  the  hyperbolic  whistles,  as  the 
whistle’s  group  delay  characteristics  are  matched  to  the  time  shift  x(f)  =  1 //"preserved  by 
hyperbolic  QTFRs.^^’^^ 


The  implementation  of  hyperbolic  QTFRs^^’  is  similar  to  the  implementation  of  power 
QTFRs  in  section  5,  as  it  is  based  on  the  warping  approach.  Any  member  of  the  hyperbolic  class 
can  be  obtained  by  warping  the  corresponding  member  of  Cohen’s  class  using  the  warping 
relation  in  equation  (15)  with  IJib)  -  \nb?^  For  example,  as  demonstrated  in  equation  (32),  the 
hyperbolic  class  Altes  Q-distribution  in  equation  (26)  can  be  obtained  by  warping  the  Cohen’s 
class  Wigner  distribution  in  equation  (5).  Specifically,  the  Altes  Q-distribution  of  a  signal  is 
obtained  by  first  warping  the  analysis  signal  in  the  frequency  domain  using  an  exponential 
frequency  axis  transformation,  then  by  computing  the  Wigner  distribution  of  the  warped  signal, 
and  lastly  by  transforming  the  time  and  frequency  axes  for  correct  time-frequency  localization.^^’ 

28  34 

’  To  demonstrate  the  advantage  of  hyperbolic  class  QTFRs  over  Cohen’s  class  QTFRs  when 
analyzing  signals  with  t-Mf  time-frequency  characteristics,  figure  13  analyzes  the  sum  of  two 


hyperbolic  impulses  defined  as  X(f)  = 


1 
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,  /  >  0  .  Note  that  a 


hyperbolic  impulse  is  defined  in  equation  (23)  with  ^{b)=\n{b). 
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Figure  13.  Time-Frequency  Analysis  of  the  Sum  of  Two  Hyperbolic  Impulses:  (a)  Wigner 
Distribution,  (b)  Smoothed  Pseudo-Wigner  Distribution,  (c)  Altes  Q-Distribution, 
and  (d)  Smoothed  Pseudo-Altes  Q-Distribution 


Hyperbolic  QTFRs  are  well  matched  to  signals  similar  to  hyperbolic  impulses  with 
hyperbolic  group  delay  structure.  Thus,  the  Altes  Q-distribution  (equation  (32))  in  figure  13(c) 
has  good  time-frequency  concentration  along  the  two  hyperbolae  t  =  2>lf  and  t-Hf  However,  it 
also  results  in  a  cross  term  along  the  mean  hyperbola^^  t  =  5lf,  as  shown  in  equation  (25)  with 
^(6)  =  InZj;  i.e.,  the  Altes  Q-distribution  is  the  sum  of  two  auto  terms  and  one  cross  term  given  by 
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The  smoothed  pseudo- Altes  Q-distribution  can  be  obtained  by  hyperbolically  warping  the 
smoothed  pseudo-Wigner  distribution  as  in  equation  (32).  As  is  shown  in  figure  13(d),  the 
smoothed  pseudo-Altes  Q-distribution  removes  the  cross  term  in  equation  (38)  with  only  some 
loss  of  time-frequency  resolution.  Cohen’s  class  QTFRs,  such  as  the  Wigner  distribution  in 
figure  13(a)  and  the  smoothed  pseudo-Wigner  distribution  in  figure  13(b),  are  not  well-matched 
to  hyperbolic  impulses.  The  Wigner  distribution  results  in  complicated  cross  terms  between  the 
two  signal  components  as  well  as  iimer  interference  terms.  In  comparison  to  the  smoothed 
pseudo-Altes  Q-distribution  in  figure  13(d),  the  smoothed  pseudo-Wigner  distribution  in  figure 
13(b)  has  a  larger  loss  of  time-frequency  resolution  and  is  not  as  successful  at  removing  all  of  the 
cross  terms. 

The  implementation  technique  for  hyperbolic  QTFRs  that  is  based  on  the  warping 
approach  depends  on  the  implementation  technique  for  Cohen’s  class  QTFRs,  together  with  a 
signal  mapping  implementation  and  an  axes  transformation.  Thus,  since  most  of  the  algorithms 
for  Cohen’s  class  QTFRs  only  can  efficiently  analyze  small  blocks  of  data,  similar  problems 
exist  for  hyperbolic  QTFRs.  Hyperbolic  QTFRs  are  more  computationally  involved  since,  for  a 
signal  of  length  N  samples,  the  corresponding  warped  signal  is  4  x  iV  samples;  therefore,  a 
Cohen’s  class  QTFR  has  to  be  computed  for  a  signal  that  is  four  times  longer  than  the  initial 
signal.^^  Thus,  in  order  to  be  able  to  analyze  the  long-duration  hjqjerbolic  whistles,  the  short- 
time  technique  was  applied  to  hyperbolic  QTFRs  such  as  the  Altes  Q-distribution  and  its 
smoothed  versions. 

Using  the  short-time  technique  for  hyperbolic  QTFRs,  the  long  data  was  first  divided  into 
short  sections.  A  hyperbolic  QTFR  of  each  time  section  was  computed  using  the  warping 
approach,  but  only  the  center  time  slice  corresponding  to  the  center  of  the  window  used  to 
section  the  data  was  saved.  The  hyperbolic  QTFR  was  computed  first  by  hyperbolically  warping 
the  signal,  computing  the  corresponding  Cohen’s  class  QTFR  of  the  warped  signal,  and  then 
transforming  the  time  and  frequency  axes  for  correct  time-frequency  localization.  Similar  to 
Cohen’s  class  QTFRs,  memory  was  saved  with  the  resulting  short-time  QTFR  having 
dimensions  of  N^/ 2  x  N/N^  (where  A  is  the  signal  length  and  Nw  is  the  length  of  the 
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sectioning  window);  the  dimensions  are  much  smaller  than  if  computed  directly  (with 
dimensions  N  x  N).  Since  implementation  of  the  hyperbolic  QTFRs  involves  computation  of 
Cohen’s  class  QTFRs  as  part  of  the  computation,  hyperbolic  QTFRs  are  more  computationally 
intensive.  As  a  result,  although  the  analysis  of  long  data  with  hyperbolic  QTFRs  was  possible,  it 
took  8-12  hours  to  compute.  Since  the  computation  time  is  not  practical,  it  was  evident  that  the 
algorithm  required  improvements  to  speed  up  processing;  additional  steps  were  taken  to  reduce 
the  computation  by  eliminating  unnecessary  processing  in  the  algorithm. 

The  existing  code  for  hyperbolic  QTFRs  was  used  originally  to  ensure  that  the  short-time 
implementation  was  possible.  However,  these  algorithms  compute  entire  representations.  The 
short-time  technique,  on  the  other  hand,  uses  only  the  center  time  slice  of  each  hyperbolic 
analysis  of  the  filtered  data.  Thus,  the  computation  of  all  the  time  slices  was  unnecessary.  Each 
step  of  the  algorithm  was  analyzed  to  determine  which  processing  was  redundant  or  unneeded. 
Overall,  the  computation  time  was  reduced  by  approximately  80%  (or  a  speedup  of  5.7);  the 
resulting  MATLAB  algorithm  to  compute  the  short-time  smoothed  pseudo-Altes  Q-distribution 
(SPAD)  is  given  in  section  A.  8  in  the  appendix. 

The  short-time  technique  was  used  with  the  SPAD  to  analyze  the  whistles  of  long-finned 
pilot  whales,  as  these  marine  mammals  emitted  whistles  with  a  hyperbolic  time-frequency 
structure  in  many  data  files.  For  example,  figure  14  analyzes  the  whistles  of  a  long-finned  pilot 
whale  using  a  short-time  SPAD.  The  whole  data  segment  ofN=  32,984  samples  was  analyzed 
(the  data  was  decimated  nine  times  and  it  corresponds  to  1.78  s),  and  the  resulting  SPAD  had 
dimensions  64  x  257.  The  short-time  SPAD  clearly  shows  three  hyperbolic  whistles  with  \/f 
characteristic  time-frequency  structure.  For  example,  the  first  hyperbolic  whistle  lasts  for  0.43  s 
starting  at  0.5 1  s.  It  has  a  bandwidth  of  1 .9  kHz  starting  with  an  upper  fi-equency  of  3.8  kHz  and 
ending  at  a  lower  frequency  of  1 .9  kHz.  The  computation  required  8.65  hours  with  the  original 
short-time  programming;  after  the  changes  were  made  to  increase  the  computational  speed,  the 
same  results  were  obtained  in  1.52  hours. 
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Figure  14.  Short-Time  SPAD  of  Long-Finned  Pilot  Whale  Whistles  from  Data  File 
7703501 l.kay  (Note  that  the  hyperbolic  time-frequency  structure 
of  the  whistles  is  matched  to  the  hyperbolic  QTFR.) 
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7.4  POWER  CLASS  QTFRs  ANALYSIS  OF  WHISTLES 


Many  marine  mammals  from  the  SOUND  database  emit  whistles  with  a  characteristic 
power  time-frequency  structure,  i.e.,  the  whistle’s  frequency  content  changes  in  a  power  fashion 
t  =  during  the  duration  of  the  whistle.  For  example,  initial  analysis  of  some  of  the  whistles 
of  spotted  dolphins  show  components  with  power  function  time-frequency  characteristics.  These 
types  of  signals  would  be  best  analyzed  using  power  class  QTFRs  with  variable  power  parameter 
K,  as  presented  in  section  4.2.3.  The  power  QTFR  would  be  a  good  analysis  tool  provided  that 

K-l 

preserved  by  the  power  QTFR  is  closely  matched  to  the  signal’s 

power  group  delay  function. 


the  time  shift  x(  f )  - 


K 

/ 

fr 

In  section  5.1,  the  discrete  implementation  of  power  class  QTFRs  was  presented  based  on 
the  fact  that  any  power  class  QTFR  can  be  obtained  by  warping  the  corresponding  affine  class 
QTFR  as  shown  in  equation  (33).  The  example  in  figure  15  demonstrates  the  advantage  of 
using  power  class  QTFRs  over  affine  class  QTFRs  when  analyzing  signals  with  t  =  group 
delay  time-frequency  characteristics.  Figures  15(a)  and  15(b)  show  the  results  obtained  with  the 
power  Wigner  distribution  (power  warped  version  as  in  equation  (33)  of  the  Wigner  distribution) 
and  a  smoothed  pseudo-power  Wigner  distribution  (power  warped  version  as  in  equation  (33)  of 
the  affine  smoothed  pseudo-Wigner  distribution)  with  a  very  short  window,  and  both  with  k  =  3 
for  a  two-component  signal  consisting  of  two  power  impulses,  windowed  in  the  frequency 
domain,  with  k  =  3.  The  Kth  power  impulse  is  defined  in  the  frequency  domain  as 

->4t] 

e  ^  , />0.  Note  that  the  power  impulse  is  the  generalized 

impulse  in  equation  (23)  with  i(b)  =  ijb) ,  and  the  power  parameter  k  of  the  two  power  QTFRs 

is  matched  to  that  of  the  power  impulses.  The  power  Wigner  distribution  has  very  good  time- 
frequency  concentration,  but  it  also  results  in  a  cross  term  as  in  equation  (25)  with 

i(b)  =  ^^(b)  =  sgn(b)  Iftp.  This  cross  term  is  effectively  suppressed  in  the  smoothed  pseudo¬ 
power  Wigner  distribution  with  minimal  loss  of  time-frequency  concentration. 


<fr  ) 
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Figure  15.  Power  Class  Analysis  of  Two  Power  Impulses:  (a)  Power  Wigner  Distribution  with 
K  =  5,  (b)  Smoothed  Pseudo-Power  Wigner  Distribution  with  k  =  5,  (c)  Wigner  Distribution, 
(d)  Affine  Smoothed  Pseudo-Wigner  Distribution  of  a  Two-Component  Signal  Consisting  of 
Two  Windowed  Power  Impulses  with  k  =  5,  (e)  Power  Wigner  Distribution  with  k  =  5,  and 
(f)  Smoothed  Pseudo-Power  Wigner  Distribution  with  k  =  3  of  a  Two-Component  Signal 
Consisting  of  Two  Windowed  Power  Impulses  with  k  =  </  (Duration  of  signals  is  128  samples.) 
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Figures  15(c)  and  15(d)  show  the  results  obtained  with  the  Wigner  distribution  and  an 
affine  smoothed  pseudo-Wigner  distribution,  both  members  of  the  affine  class.  The  Wigner 
distribution  in  figure  15(c)  is  not  matched  to  the  power  impulses  as  evidenced  by  complicated 
cross  terms.  The  affine  smoothed  pseudo-Wigner  distribution  in  figure  15(d)  does  not  remove 
all  the  cross  terms  and  has  a  larger  loss  of  time-frequency  concentration  than  the  smoothed 
pseudo  power-Wigner  distribution  in  figure  15(b).  The  results  of  the  two  power  QTFRs  with 
K  =  3  in  figures  15(a)  and  15(b)  are  better  than  those  of  the  corresponding  two  affine  class 
QTFRs  in  figures  15(c)  and  15(d)  because  the  former  are  optimally  matched  to  the  power-law 
group  delays  of  the  two  power  impulse  signal  components.  In  order  to  demonstrate  the  effect  of 
a  slightly  mismatched  power  parameter  k,  figures  15(e)  and  15(f)  show  the  results  obtained  when 
analyzing  two  power  impulses  with  k  =  4  (instead  of  k  =  3)  using  the  power  Wigner  distribution 
and  a  smoothed  pseudo-power  Wigner  distribution  with  k  ==  3.  The  power  QTFRs  with  k  =  3 
used  in  figures  15(e)  and  15(f)  are  the  same  as  those  used  in  figures  15(a)  and  15(b), 
respectively,  but  the  two  power  impulses  now  have  power  parameter  k  =  4  instead  of  k  =  3;  thus, 
the  power  parameters  of  the  signal  and  the  power  QTFRs  are  different.  The  results  are  better 
than  those  of  the  Wigner  distribution  and  the  affine  smoothed  pseudo-Wigner  distribution  in 
figures  15(c)  and  15(d)  due  to  the  fact  that  the  power  parameter  mismatch  in  figures  15(e)  and 
15(f)  is  smaller  than  that  in  figures  15(c)  and  1 5(d).^^’ 

The  power  class  as  defined  in  section  4.2.3  is  obtained  by  warping  the  affine  class.  The 
power  time-shift  covariance  property  in  the  power  class  is  useful  in  analyzing  signals  whose 
group  delay  is  matched  to  the  power  time  shift.  The  power  class  QTFRs  also  satisfy  the  scale 
covariance  property  that  is  important  in  multiresolution  analysis  applications.  It  is,  however, 
possible  to  warp  Cohen’s  class  using  the  same  power  warping  as  in  equation  (33)  to  obtain  the 
power  warped  Cohen’s  class,"**’ which  is  an  example  of  a  generalized  warped  Cohen’s  class  in 
section  4.1.3  with  ^(b)  =  as  shown  in  table  4.  The  Kth  power  warped  Cohen’s  class  also 

satisfies  the  Kth  power  time-shift  covariance  property,  but  it  does  not  satisfy  the  scale  covariance 
property.  Thus,  if  an  application  does  not  require  multiresolution  analysis,  but  it  is  important  to 
match  the  power  group  delay  of  a  signal  (as  in  the  case  of  the  analysis  of  power  whistles),  then 
either  a  power  QTFR  (power  warped  affine  QTFR)  or  a  power  warped  Cohen’s  class  QTFR  may 
be  used. 
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In  order  to  analyze  some  of  the  spotted-dolphin  whistles,  the  short-time  technique  was 
implemented  for  the  Kth  power  QTFRs  to  enable  large  sections  of  the  data  to  be  analyzed  in  an 
on-line  fashion.  Using  the  short-time  technique  for  power  QTFRs,  the  long  data  was  first 
divided  into  short  sections,  and  then  a  power  QTFR  of  each  time  section  was  computed  using  the 
warping  approach  in  equation  (33)  by  warping  Cohen’s  class  QTFRs  or  affine  class  QTFRs. 
When  the  power  QTFR  of  the  warped  data  section  was  obtained,  only  the  center  time  slice  was 
saved,  which  corresponded  to  the  center  of  the  window  used  to  section  the  data.  Similar  to  the 
short-time  hyperbolic  technique  discussed  in  section  7.3,  the  computational  time  was  also 
reduced  to  speed  up  processing.  The  resulting  MATLAB  algorithm  for  the  short-time  power 
smoothed  pseudo-Wigner  distribution  (which  is  the  power  warped  version  of  Cohen’s  class 
smoothed  pseudo-Wigner  distribution)  is  given  in  section  A.9  in  the  appendix.  Note  that  as  the 
whistles  of  the  spotted  dolphins  appear  to  start  at  much  higher  frequencies  than  the  hyperbolic 
whistles  of  the  long-fmned  pilot  whales  (tables  8  and  10),  the  data  for  the  spotted  dolphins 
cannot  be  decimated  by  a  large  factor  for  fear  that  high-frequency  information  would  be  lost.  As 
a  result,  the  number  of  samples  for  the  spotted-dolphin  whistles  is  very  large,  and  although  the 
computational  savings  have  been  applied  to  the  short-time  power  QTFR,  the  computational  time 
is  still  considerable.  Using  the  short-time  power  QTFR  implementation  with  a  k  =  2  power 
smoothed  pseudo-Wigner  distribution,  125,000  samples  were  analyzed  from  a  spotted-dolphin 
whistle  (decimated  by  a  factor  of  2)  from  data  file  83035024.kay.  The  computational  time  was 
6.4  hours  and  the  resulting  analysis  with  a  power  group  delay  time-frequency  characteristic  is 
shown  in  figure  16. 
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Figure  16.  Short-Time  Power  Smoothed  Pseudo-Wigner  Distribution  with  k= 2  of  a  Spotted- 
Dolphin  Whistle  from  Data  File  8303S024.kay  (Note  that  the  power  time-frequency  structure 

of  the  whistle  is  matched  to  the  power  QTFR.) 
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7.5  ADAPTIVE  OPTIMAL  KERNEL  QTFR  ANALYSIS  OF  WHISTLES 


In  the  analysis  of  whistles  consisting  of  long  data  segments,  short-time  time-frequency 
techniques  were  used  to  enable  computations  of  large  data  on-line,  such  as  short-time  Cohen, 
hyperbolic,  and  power  QTFRs.  These  short-time  techniques  used  QTFRs  whose  kernels  were 
fixed  for  the  whole  signal  analysis.  However,  when  the  signal  characteristics  change  with  time, 
new  adaptive  techniques  that  use  QTFRs  whose  kernels  change  as  the  signal  changes  are  used 
for  a  better  time-frequency  analysis.  When  combined  with  the  short-time  techniques,  the 
adaptive,  signal-dependent  kernel  QTFRs  can  compute  various  whistle  characteristics  in  an  on¬ 
line,  fast  fashion.  In  particular,  the  short-time  adaptive  radially  Gaussian  QTFR  (AOK)*^^adapts 
the  kernel  to  each  set  of  signal  components,  resulting  in  a  highly  concentrated  time-frequency 
representation  that  is  best  matched  to  linear  chirps.  A  copy  of  the  author’s  C  language  program 
has  been  obtained  for  the  AOK,  and  has  been  applied  to  the  whistles  from  the  marine  mammal 
database.  As  expected,  the  compiled  C  program  is  very  fast  as  compared  to  the  interpreted 
MATLAB  programs.  The  intention  was  to  apply  hyperbolic  and  power  warping  to  the  AOK 
program  in  order  to  match  the  AOK  to  hyperbolic  and  power  impulses.  It  has  been  discovered, 
however,  that  for  long  data,  the  short-time  characteristic  of  this  method  looks  at  segments  of  the 
data  at  a  time  that  can  be  approximated  to  linear  chirps.  In  other  words,  for  a  long  section  of  data 
(corresponding  to  one  whistle),  even  if  the  whistle  has  a  hyperbolic  time-frequency  structure, 
each  short  segment  in  the  short-time  adaptive  AOK  appears  linear.  As  a  result,  since  the  AOK  is 
best  matched  to  linear  chirps,  the  results  obtained  are  satisfactory.  It  is  not  believed  that  the 
warping  will  give  any  better  results;  on  the  contrary,  it  will  try  to  analyze  each  short  segment  as  a 
hyperbolic  impulse  (when  it  is  actually  a  linear  chirp)  and  the  warping  will  increase  the 
computational  time.  To  demonstrate  the  analysis  obtained  by  the  AOK,  three  long-finned  pilot 
whale  whistles  from  data  file  7703501 1  .kay  were  considered.  This  file  has  been  previously 
analyzed  using  the  hyperbolic  short-time  smoothed  pseudo-Altes  Q-distribution  in  figure  14. 
Although  good  results  have  been  obtained  using  the  short-time  smoothed  pseudo-Altes  Q- 
distribution,  as  the  whistles  have  hyperbolic  time-frequency  structure,  the  computation  took  1.5 
hours  as  discussed  in  section  7.3.  Using  the  C  program  for  the  short-time  adaptive  radially 

*  The  radially  Gaussian  QTFR  is  not  a  member  of  Cohen’s  class  of  time-shift  and  frequency-shift  covariant  QTFRs, 
since  its  kernel  depends  on  the  analysis  signal.  However,  similar  to  Cohen’s  class  QTFRs,  it  is  matched  to  signals  with  constant 
or  linear  group  delay  characteristics. 


91 


Frequency  (Hz) 


Gaussian  kernel,  similar  results  were  obtained  (figure  17)  because  the  AOK  is  matched  to  linear 
portions  of  short  segments  of  the  hyperbolic  curve.  The  advantage  of  using  this  method  is  that 
the  computation  now  only  takes  0.3  hours.  The  difference  in  resolution  in  figures  14  and  17  is 
due  to  the  smoothed  pseudo- Altes  Q-distribution  in  figure  14  using  two  smoothing  windows  that 
limit  the  time-frequency  resolution,  whereas  the  AOK  in  figure  17  adapts  to  the  signal  changes. 


Time  (s) 

Figure  1 7.  Short-Time  Adaptive  Radially  Gaussian  Kernel  QTFR  of  Long-Finned  Pilot 
Whale  Whistles  from  Data  File  77035011.kay 
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8.  CONCLUSIONS 


Marine  mammal  sounds  are  time-varying  signals  whose  frequeney  content  changes  with 
time.  As  a  result,  time-frequency  analysis  techniques  are  ideal  tools  for  analyzing  such  signals, 
since  time-frequency  representations  provide  temporal  localization  of  the  time-varying  signal’s 
spectral  components.  In  this  project,  QTFRs  were  used  to  analyze  the  time-frequency  structure 
of  cetacean  mammal  sounds,  such  as  clicks  and  whistles  emitted  by  dolphins  and  whales.  An 
investigation  was  proposed  to  determine  signal  types  and  their  properties,  and  to  investigate 
which  types  of  QTFRs  are  best  suited  for  analyzing  these  biological  signals  based  on  the 
properties  of  the  signals. 

As  proposed,  sonar  signals  (clicks)  that  are  used  by  the  marine  mammals  for  echolocation 
were  analyzed.  The  analysis  tools  used  were  Cohen’s  class  QTFRs,  such  as  the  Wigner 
distribution,  the  pseudo-Wigner  distribution,  and  the  spectrogram.  Cohen’s  class  QTFRs  were 
used  because  clicks  were  found  to  be  short-duration,  broadband,  transient-like  pulses  with 
constant  group  delay  characteristics  and  Cohen’s  class  QTFRs  are  matched  to  signals  with  linear 
or  constant  group  delay  characteristics,  as  they  preserve  the  signal’s  constant  time  shifts.  As 
proposed,  non-echolocation  signals,  called  whistles,  which  are  used  by  marine  mammals  for 
communication,  were  also  analyzed.  The  whistles  were  found  to  be  long-duration,  narrowband, 
frequency-modulated,  continuous  tonal  sounds  that  vary  in  frequency,  duration,  and  in  the  shape 
of  the  whistle  time-frequency  structure,  depending  on  the  nature  of  the  signal’s  frequency 
modulation.  As  a  result,  the  marine  mammals  have  different  signature  whistles  whose  group 
delay  time-frequency  characteristics  included  linear,  hyperbolic,  exponential,  and  power 
structures.  Based  on  this,  the  generalized  time-shift  covariant  QTFRs  discussed  in  this  report 
would  be  well-matched  for  analyzing  the  different  types  of  whistles  simply  by  choosing  the 
corresponding  group  delay  time-frequency  structure  of  the  whistle  and  matching  it  to  the 
QTFR’s  time  shift.  In  particular,  it  has  been  shown  that  many  whistles  of  long-finned  pilot 
whales  have  2it=\lf  group  delay  characteristic  that  was  successfully  analyzed  using  hyperbolic 
QTFRs,  i.e.,  QTFRs  that  preserve  hyperbolic  time  shifts  on  the  analysis  signal.  Initial  analysis  of 
some  of  the  whistles  of  spotted  dolphins  show  components  with  power  function  t  =  time- 
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frequency  characteristics.  Thus,  it  has  been  shown  that  power  QTFRs  are  matched  to  the 
analysis  of  these  whistles  as  they  preserve  the  signal’s  power  changes  in  group  delay.  Given  a 
marine  mammal  whistle,  some  preprocessing  was  necessary  to  determine  its  type  of  time- 
frequency  structure.  This  preprocessing  was  usually  performed  with  the  spectrogram  and  was 
based  on  visual  conclusions. 

The  MATLAB  code  needed  for  the  analysis  using  Cohen’s  class  QTFRs  and  hyperbolic 
class  QTFRs  already  existed.  But,  as  proposed,  the  MATLAB  code  for  other  QTFRs  that  were 
needed,  such  as  the  code  for  the  power  class  QTFRs  and  the  code  for  the  exponential  class 
QTFRs,^^’'’^  were  developed  herein.  During  the  duration  of  the  project,  the  relevant  theory  that 
was  developed  in  relation  to  the  generalized  time-shift  covariant  QTFRs  was  recorded.^^'^'*’ 

The  major  problem  encountered  was  the  fact  that  the  sample  duration  of  the  whistles  was 
very  large,  and  the  MATLAB  code  for  the  various  QTFRs*  suffered  from  memory  management 
problems,  with  the  system  memory  quickly  consumed.  This  problem  was  addressed  by  using 
short-time  time-frequency  analysis  techniques  for  Cohen’s  class,  hyperbolic  class,  and  power 
class  QTFRs.  The  short-time  technique  yields  a  real-time,  on-line  operation  that  does  not  require 
the  whole  signal  present  in  memory.  Thus,  the  long  data  can  be  processed  by  the  appropriate 
QTFRs  without  running  out  of  system  memory.  Short-time  data-adaptive  QTFRs  were  also 
used,  as  they  adapt  to  changes  in  the  signal’s  time-frequency  structure. 


*  The  spectrogram  implementation  was  relatively  fast  as  a  MATLAB  built-in  function  was  used  for  its  computation. 
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APPENDIX 

MATLAB  PROGRAMS 


This  appendix  comprises  the  essential  MATLAB  programs  used  for  the  project. 

A.1  MATLAB  FUNCTION  TO  EXTRACT  SOUND  DATABASE  FILE  HEADER 
AND  READ  THE  DATA  (loadkay.m) 


function  [y, header, Fs, Ns]  =  loadkay(filename,N, offset) 

%  LOADKAY  Reads  WHOI  .kay  files.  Reads  N  samples  from  filename  starting  at 
%  offset,  returns  the  file  header,  the  data  vector  y,  the  sampling  rate  Fs 

%  samples/sec  and  the  total  number  of  samples,  Ns. 

%  Antonia  Papandreou-Suppappola,  22Jan96 

%  Open  file,  return  file  descriptor 
fid  =  fopen(filename,’r’); 
if  fid  <  0, 

error(sprintf(‘Unable  to  open  file:  %s’,  filename)); 
end 

%  Read  header  in  first  512  bytes  of  digitized  file 
header  =  ffead(fid,512,’char’); 

%  Remove  nulls 

i=find(header=0) ;  header(i)=3  0  *ones(size(i)) ; 

%  Move  each  field  to  a  separate  line,  they  are  separated  by  in  file 

i=find(header=‘_’); 

header(i)=l  0*ones(size(i)); 

i=find(header=‘R’); 

header(i(  1 )- 1  )=1 0*ones(  1,1); 

%  Determine  the  sampling  rate,  Fs  samples/sec 
i=find(header=‘ S’);  Fs=header(i(l )+3 :i(  1  )+8); 

Fs=setstr(Fs’);  Fs=str2num(Fs); 

%  Add  the  word  ‘Samples  ‘  before  the  actual  #  of  data  samples 
header(l:8)=‘Samples  ‘; 

%  Move  the  #  of  samples  next  to  its  field  title 
header(9:26)-header(27:44);  header(27:38)=15*ones(l,12); 

%  Extract  #  of  samples.  Ns 
Ns=str2num(setstr(header(9: 14)’)); 

%  Display  header  in  ASCII  format 
header  =  setstr(header’); 

%disp(‘  ‘);  disp(header); 
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%  Read  N  samples  of  data 
if  nargin==l, 
y=fr  ead(fid,  ’  short  ’ ); 
elseif  nargin==2, 
y=fread(fid,N, ’short’); 
elseif  nargin==3, 
fseek(fid, offset*  2 ,  ’  cof ); 
y=ff  ead(fid,N,  ’  short’ ) ; 
else 

error(‘Too  many  input  arguments’) 
end 

%  close  file 
fclose(fid); 
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A.2  MATLAB  FUNCTIONS  TO  IMPLEMENT  THE  POWER  WIGNER 
DISTRIBUTION 

A.2.J  Function power_WD.m 


function  [PWD]  =  power_WD(x, kappa); 

%  POWER  WD  This  function  computes  the  kappa  power  Wigner  distribution  (PWD)  of 
%  the  time  domain,  analytic  signal  x  (that  is,  the  Fourier  transform  of  x  is 

%  zero  for  negative  frequency).  The  resulting  PWD  is  an  N  x  N  matrix, 

%  where  N  is  the  length  of  x.  To  increase  computational  speed,  N  must  be  a 

%  power  of  2. 

%  Antonia  Papandreou-Suppappola  26April96 

%  Power  warp  the  signal  in  the  frequency  domain 
[warptimesig,  M,  df,  dv,  length2]=prewarp_PC(x,kappa); 

%  Compute  the  Wigner  distribution,  wd  warp,  of  the  warped  signal,  warptimesig 
wd_warp=wd(warptimesig); 

%  Transform  the  time  and  frequency  axes  of  the  Wigner  distribution  to  obtain  PWD 
PWD=unwarp_PC(wd_warp,  M,  df,  dv,  length2,  kappa); 

%  Plotting  options 
df=0.5/size(PWD,2);  f=0:df:0.5-df; 

ml=max(max(PWD));  contour_level=[0. 1  *ml  :0.2*ml  :0.9*ml ]; 

%  For  contour  plot 

subplot(31 1);  contour(l:size(PWD,2),f,flipud(PWD),contour_level); 

title(‘Power  Wigner  distribution’); 

xlabel(‘time  samples’);  ylabel(‘norm.  frequency’); 

%  For  mesh  plot 

subplot(3 1 2);  mesh(l  :size(PWD,2),f,flipud(PWD)); 

title(‘Power  Wigner  distribution’); 

xlabel(‘time  samples’);  ylabel(‘norm.  frequency’); 

%  For  imagesc  plot 
subplot(313) 

imagesc(l:size(PWD,2),f,flipud(PWD),contour_level);  axis  xy;  colormap(jet); 

title(‘Power  Wigner  distribution’); 

xlabel(‘time  samples’);  ylabel(‘norm.  frequency’); 
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A.2.2  Function prewarp  PC.m 


function  [warptimesig,  M,  df,  dv,  length2]=prewarp_PC(input, kappa); 

%  PREWARP  PC  Function  that  power  warps  with  power  kappa>l  the  input  time  signal  to 
%  obtain  the  time  signal,  warptimesig.  M  is  the  number  of  samples  in  the 

%  warped  DFT  of  the  input  signal  (before  zeropadding),  df  is  the  frequency 

%  sampling  rate  at  which  the  input  signal  is  assumed  to  be  sampled,  and  dv 

%  is  the  frequency  sampling  rate  at  which  the  warped  signal  is  assumed  to  be 

%  sampled.  Iength2  is  the  length  of  warptimesig. 

%  Antonia  Papandreou-Suppappola  March95  for  power  classes 
%  Kyle  Canfield  1991  for  hyperbolic  class 

u  =  8;  %  Set  upsampling  rate  for  DFT  of  input  signal 

L  =  length(input);  %  Set  length  of  input  signal 

N  =  u*L;  %  Set  length  of  zeropadded  input  signal 

inputzero  =  [input  zeros(size(l  :N-L))];  %  Initialize  zeropadded  input 

fftinsig  =  fft(inputzero);  %  Determine  DFT  of  zeropadded  input  signal 

%  Determine  M,  df,  dv 
df  =  2/L; 

M  =  2*N/u; 

dv  =  ( (df/u)*N/2  )^(1 /kappa)  /  M; 

%%  To  ensure  that  no  aliasing  occurs  in  the  warped  signal 
%form=  1:M-1, 

%  test(m)  =  ((m+l)*dv)^(l/kappa)  -  (m*dv)^(l /kappa); 

%  if  test(m)  >  df 

%  disp(‘THERE  WILL  BE  ALIASING  IN  THE  WARPED  SIGNAL! !  ’) 

%  break 
%  end 
%end 

%  Weight  X(f)  by  inverse  of  square  root  of  characteristic  group  delay  function 
groupdelay  =  0; 
for  count  =  0:(N/2)-l 

group_delay(count+l)  =  kappa  *  (df'  (count+1)  )^(kappa-l); 
fftinsig(count+l)  =  fflinsig(count+l)/sqrt(group_delay(count+l)); 
end 

%  Determine  the  samples  of  the  warped  DFT  of  the  input  signal 
fftexpinsig(l)  =  fftinsig(l); 
for  count  =  1:M-1 

est  =  (  ((count+l)*dv)^(l/kappa)  )/dPu; 
estint  =  floor(est); 

dif  =  ( fftinsig(estint+l)-fftinsig(estint)  )*(est-estint); 
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fftexpinsig(count+l)  =  dif  +  fftinsig(estint); 
end 

%  Shift  the  samples  of  warped  DFT  of  the  input  signal  into  an  actual  DFT  form 

%  and  bandlimit  to  1/4  of  sampling  rate 

fftwarp=fftshift([zeros(l,M/2),  fftexpinsig,  zeros(l,M/2)]); 

warptimesig  =  ifft(fftwarp);  %  Find  a  time-domain  version  of  the  warped  DFT 

length2=(L/2)*4*2; 


A.2.3  Function  wd.m 


function  [W]=wd(x,NF) 

%  WD  This  function  calculates  the  Wigner  distribution  (W)  of  the  signal  x  where  x  must  be 
%  a  column  vector.  W  is  a  real  N  x  NF  matrix,  where  N=length(x).  If  NF  is  not  specified, 

%  then  NF  =  N  provided  that  N  is  even;  otherwise,  NF  =  N  +  1.  For  speed  purposes, 

%  NF  should  be  a  power  of  2 ! 

%  Antonio  H.  Costa,  April  2,  1 992 

[rows  cols]  =  size(x); 

N  =  length(x); 
if  cols  ~=  1,  x=x.’;  end 
if  nargin  =  1, 
if  rem(N,2)  =  1, 

NF  =  N+  1; 
else 

NF  =  N; 
end 

elseif  nargin  =  2, 
if  rem(N,2)  ==  0, 
if  NF  <  N, 

error(‘Number  of  FFT  points  must  be  >=  to  the  length(x)’); 
end 
else 

if  NF  <  N, 

error(‘Length(x)  cannot  be  greater  than  the  number  of  FFT  points’); 
end 
end 
else 

error(‘Only  1  or  2  arguments  allowed!...’); 
end 

M  =  N- 1; 

L  =  N/2; 
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X  =  [zeros(L-l,l);x;zeros(L+l,l)]; 
q  =  zeros(N,N); 
for  j  =  1:N, 

q(2:Nj)  =  xO:M+j-l)  .*  conj(x(M+j-l:-l;j)); 
end 

q=[q(fix(N/2)+l:N,:);  zeros(NF-N,N);  q(l:fix(N/2),:)]; 
W  =  2  *  fft(q); 

W  =  real(fftswap(W)); 

W  =  flipud(W); 


A.2.4  Function Jftswap.m 
function  [r]=fftswap(A) 

%  FFTSWAP  Swaps  the  order  of  the  rows  of  matrix  A. 

[rows  cols]  =  size(A);  r  =  [A(fix(rows/2)+l:rows,:);A(l:fix(rows/2),:)]; 


A. 2. 5  Function  unwarp_PC.m 


function  [PC_QTFR]=unwarp_PC(gg,  M,  df,  dv,  length2,  kappa); 


% 

% 

% 

% 

% 

% 

% 

% 

% 

% 


UNWARP  PC 


%  Function  that  transforms  the  time-frequency  axes  of  the  affine  class 
QTFR  of  the  warped  signal,  gg,  for  correct  time-frequency  localization  to 
obtain  the  corresponding  kappa  power  class  QTFR,  PC  QTFR.  M  is  the 
number  of  samples  in  the  warped  DFT  of  the  input  signal  (before 
zeropadding),  df  is  the  frequency  sampling  rate  at  which  the  input 
signal  is  assumed  to  be  sampled,  and  dv  is  the  frequency  sampling  rate 
at  which  the  warped  signal  is  assumed  to  be  sampled.  Iength2  is  the  length 
of  the  warped  time  signal.  Note  that  the  affine  QTFR  of  the  warped 
signal  has  dimensions  length  2  x  length2  and  the  resulting  PC  QTFR 
has  dimensions  length2/4  x  length2/4. 


%  Antonia  Papandreou-Suppappola  June95  for  power  classes 
%  Kyle  Canfield  1991  for  hyperbolic  class 


L=length2/4;  %  Determine  dimensions  of  power  QTFR  based  on  affine  QTFR  dimensions 
start  =  1 ;  stop  =  L-1 ;  %  Initialize  start  and  stop  variables 
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%  Find  matrix  alt  with  correct  frequency  locations  of  input  signal  components 
for  count=0:length2-l 

countd=count+l;  %  Set  up  time  sample  counter  from  1  to  length2 
for  coimt2=l:  start 
count  1  =L-count2+ 1 ; 

alt(countl,  countd)=0;  %  Zero  out  frequencies  in  alt  not  to  be  calculated,  ones  with  index  L 
end 

for  count2=start:stop 

est=((count2*df/2)^kappa)/dv;  %  Find  the  frequency  sample  on  [1  ,M]  to  which 

%  the  frequency  sample  count2  in  alt  corresponds 
est=est+length2/4-M;  %  Compensate  for  the  frequency  shifting  done 

%  to  the  warped  signal  in  PREWARP.  As  a  result, 

%  est  in  on  the  interval  [-length2/4,  length2/4] 
est=est*2;  %  Now  est  is  on  the  interval  [-length2/2,  length2/2] 
est=est+length2/2;  %  Now  est  is  on  the  interval  [0,  length2]  and 
%  corresponds  to  a  frequency  in  gg 
estint=floor(est);  %  Find  the  largest  integer  smaller  than  est 
est=length2-est+l ;  %  Correct  est  to  correspond  to  the  right  frequency 

%  as  indexed  by  Matlab 

estint=length2-estint+l;  %  Determine  two  integers  (estint  and  estint+1) 

%  between  which  est  lies.  These  integers 
%  correspond  to  frequency  samples  of  gg  between 
%  which  UNWARP  will  interpolate  to  get  the 
%  correct  value  for  the  sample  of  interest  in  alt 
count  l=L-count2+l;  %  Correct  the  frequency  index  of  alt  to  correspond  to 
%  the  indexing  system  used  by  Matlab 
alt(countl-l,  countd)  =  0;  %  initialize  the  new  matrix  values  to  zero 

%  note  that  countd  is  the  current  time  sample 

if  estint>0 

if  estint<=length2  %  Be  sure  estint  is  within  the  index  limits  of  gg 
diff=(gg(estint-l,  count+l)-gg(estint,  count+l))*(est-estint); 
alt(countl-l,  countd)  =  diff+gg(estint-l,  count+1); 

%  Interpolate  between  the  two  integers  estint  and  estint+1  to  find 
%  the  correct  value  for  the  sample  of  interest  in  alt 
end 
end 
end 
end 

%  Find  a  matrix  PC  QTFR,  the  power  QTFR,  which  shows  the  correct  time  and  frequency 
%  locations  of  signal  components  in  the  input  signal 
PC_QTFR=zeros(L,L);  %  Initialize  PC_QTFR 
for  count  1=1  :L  %  time  samples 
for  count2=l:L  %  frequency  samples 
freqcnt=L-count2+l ;  %  Correct  the  frequency  index  to  correspond 

%  to  Matlab ’s  indexing  system;  freqcnt=frequency  row 
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expand  1  =freqcnt/L; 

expand=kappa*expand  1  ^(kappa- 1 );  %  Determine  amount  to  expand  freqcnt 

expand=l /expand;  %  expand  factor  is  inverted  since  time  is  inversely 
%  proportional  with  frequency  relation 
arg  =  (count  1  - 1  )*expand+ 1 ;  %  Find  a  sample  in  alt  that  corresponds 

%  to  the  sample  of  interest  in  PC  QTFR 
intarg  =  floor(arg);  %  Determine  two  integers  (intarg  and  intarg+1) 

%  between  which  arg  lies.  These  integers  correspond 
%  to  time  samples  of  alt  between  which  UNWARP  will 
%  interpolate  to  get  the  correct  value  for  the 
%  sample  of  interest  in  PC_QTFR 
if  intarg>=l  %  Be  sure  intarg  is  within  the  index  limits  of  alt 
if  intarg  <  length2 

diff2=(alt(count2,intarg+l)-alt(count2,intarg))*(arg-intarg); 
PC_QTFR(count2,  count  1)  =  diff2+alt(count2, intarg); 

%  Interpolate  between  the  two  integers  intarg,  intarg+1  to  find  the 
%  correct  value  for  the  sample  of  interest  in  PC  QTFR 
end 
end 

if  intarg<l 

PC_QTFR(count2,countl  )=alt(count2, 1 ); 

%  Use  sample  value  at  first  time  sample  if  intarg  <  1,  in  which  case 
%  there  is  no  sample  at  zero  to  use  in  interpolating  between  zero  and  one 
end 
end 
end 
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A.3  MATLAB  FUNCTION  TO  PLOT  A  SECTION  OF  THE  DATA,  ITS  FOURIER 
TRANSFORM,  WIGNER  DISTRIBUTION,  AND  SPECTROGRAM 
(dataanalysis.m) 


fiinction  [y,Y,WD,SPEC] 
%  DATA_ANALYSIS 

% 

% 

% 

% 

% 

% 

% 

% 


data_analysis(filename,N, offset) 

Reads  and  plots  data,  computes  and  plots  Fourier  transform  (FT), 
Wigner  distribution  (WD)  and  spectrogram.  Reads  N  samples  from 
filename  starting  at  offset  sample,  returns  the  data  y,  its  FT  Y,  its 
WD,  and  its  spectrogram,  SPEC.  To  start  at  the  beginning 
of  the  data,  offset  is  set  to  zero.  To  read  all  the  data,  omit  N  and 
offset.  When  all  the  data  is  read,  the  WD  is  not  plotted.  Instead, 
the  data  is  decimated  to  a  sampling  frequency  of  8,192  Hz  to 
sound  as  actually  recorded. 

The  WD  algorithm  is  not  computationally  efficient  for  N  >  1024. 


%  Antonia  Papandreou-Suppappola,  29Jan96 


%  Obtain  the  header  information  of  the  digitized  file, 
%  read  data  vector  y  of  length  N  and  sampling  rate,  Fs 
if  nargin  ==  1 , 

[ys,header,Fs,Ns]=loadkay(filename); 

N=length(ys); 

disp(‘  ‘);  disp(header); 
else 

[ys,header,Fs,Ns]=loadkay(filename,N,offset); 

disp(‘  ‘);  disp(header); 
end 


%  Reset  axis 
clf;subplot(l  1  l);gcf; 

%  To  remove  really  low  frequencies  due  to  engine  noise 
b=firl(40,0.08,’high’);  %  High  pass  filter 

%  To  check  frequency  response 
[h,w]=freqz(b,  1,256);  plot(w,abs(h)) ; 
y=conv(ys,b);  y=y(20:N+20-l); 

%  Plot  the  data 
Ts=l/Fs; 

t=  [offset;  (N+offset- 1 )]  *Ts; 
subplot(411); 

plot(t,  y);  xlabel(‘time,  s’);  ylabel(‘amplitude’); 
title(sprintf(‘ Digitized  %g  data  samples  from  %s’,  N,  filename)); 
axis([offset*Ts  (N+offset- l)*Ts  min(y)  max(y)]);  figure(gcf) 
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%  Compute  and  plot  the  Fourier  transform  of  the  data 
%  Plot  only  positive  half  of  spectrum  since  symmetric  FFT  (real  data) 

Y=fft(y); 

f^[0:((N/2)-l)]/N*Fs; 

subplot(412); 

plot(f,abs(Y(l  :N/2)));  ylabel(‘magnitude’); 

%plot(f,20*logl0(abs(Y(l:N/2))));  ylabel(‘magnitude  in  dB’); 

xlabel(‘ frequency  in  Hz’);  title(‘Fourier  transform  of  the  data’);  figure(gcf) 

ifN<=  1024, 

%  Compute  and  plot  the  Wigner  distribution  provided  data  length  is  small 
WD=wd(y,N); 

WD=WD(N/2+l  :N,:);  %  Since  real,  keep  only  half  the  frequencies 

subplot(413); 

f^[0:N/2-l]/N*Fs/2; 

imagesc(t,fliplr(f),WD);  title(‘Wigner  distribution  of  the  data’) 
ylabel(‘frequency,  Hz’);  xlabel(‘time,  s’);  figure(gcf) 
else 

%  Decimate  to  sampling  frequency  of  8,1 92  Hz  to  sound  as  actually  recorded 
%  Resample  data  at  a  lower  rate  after  lowpass  filtering 
decimation_factor=round(F  s/81 92); 

Dy=decimate(y,decimation_factor); 

DFs=Fs/decimation_factor;  %  DFs=8192; 

DTs=l/DFs; 

DN=N/decimation_factor; 

Dt=[0:(DN)]*DTs; 

subplot(413); 

plot(Dt,  Dy);  xlabel(‘time,  s’);  ylabel(‘amplitude’); 

title(sprintf(‘Data  deeimated  by  a  factor  Fs/8192=%g’,  decimation  factor)); 
figure(gcf) 
end 

%  Compute  and  plot  the  spectrogram  of  the  data 

%  If  length  of  data  is  small,  use  smaller  window  length  for  better  resolution 
%  For  clicks,  use  small  window  length  (e.g.  4  or  8)  since  need  better  time  res. 
subplot(414) 
ifN<=1024, 

SPEC=specgram(y,256,Fs,8,7);  %  built-in  Matlab  program 
else 

SPEC=specgram(y  ,25 6 ,Fs) ; 
end 

f=[0:N/2-l]/N*Fs; 

imagesc(t,f,abs(SPEC));axis  xy;  colormap(jet); 
title(‘ Spectrogram  of  the  data’) 
ylabel(‘frequency,  Hz’);  xlabel(‘time,  s’); 
figure(gcf) 
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A.4  MATLAB  FUNCTION  TO  COMPUTE  THE  PIECEWISE  WIGNER 
DISTRIBUTION  (piece_wd.m) 


function  [WD]=piece_wd(filename,N,decimation_factor,timeoffset) 

%  PIECE_WD  Computes  the  Wigner  distribution  (WD)  of  Nwd  blocks  of  data,  each 
%  block  256  samples  long  such  that  the  total  section  analyzed  is  N  samples. 

%  The  WDs  of  the  various  blocks  are  plotted  consecutively.  The  section 

%  analyzed  starts  at  time  timeoffset.  The  data  is  first  decimated  by  a  factor 

%  decimationfactor. 

%  Antonia  Papandreou-Suppappola  24Feb  1 996 

%  Load  data 

[ydata,header,Fs,Ns]=loadkay(filename); 
disp(‘  ‘);  disp(header);  %  Display  header 

%  Duration  of  digitized  cut 

T=Ns/Fs;  disp(‘  ‘);  disp(sprintf(‘ Signal  duration  in  seconds:  %g’,T)); 

L=256;  %  Length  of  each  block  of  Wigner  distribution 

Nwdmax=floor(Ns/L);  %  Number  of  Wigner  distributions  to  be  computed  for  all  data 

Nwd=N/L;  %  Number  of  Wigner  distributions  of  length  L  that  are  computed 

%  Subtract  dc  value  to  remove  initial  frequencies  due  to  recording  equipment 
ydata=ydata-mean(ydata) ; 

%  Highpass  filter  to  remove  really  low  frequencies  due  to  engine  noise 
b=firl  (40, 0.08, ’high’); 
ydata=conv(ydata,b);  N  data=size(ydata) ; 

ydata=ydata(20:Ns+20-l);  %  since  length  increased  by  40  samples 

%  Decimate  the  data  to  compute  larger  sections  of  WD 
y=decimate(ydata,decimation_factor); 

F  s=Fs/ decimationfactor ;  N  s=N  s/ decimationfactor; 

%  Offset  sample 
offset=floor(timeoffset*Fs/L); 

%  Compute  the  WD  of  each  block  and  piece  it  together 
WD-[]; 

for  i=offset:(Nwd+offset)-l, 

WD=[WD  wd_real(y(i*L+l:(i+l)*L))]; 
end 

%  Plot  piecewise  WD 
twd=[(offset*L+ 1  ):(offset+Nwd)*L]/Fs; 
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fwd=[0:L/2-l]/L*Fs/2;  m2=max(max(WD)); 
imagesc(twd,fliplr(fwd),WD);axis  xy;  colormapO'et); 
title(‘Piecewise  Wigner  distribution  of  the  data’) 
ylabel(‘ffequency,  Hz’);  xlabel(‘time,  s’); 
hold  on 

for  k=0:Nwd-l, 

plot([((offset+k)*L+l)/Fs  ((offset+k)*L+l)/Fs],  [min(fwd)  max(fwd)],’red’) 
end 

hold  off 

%  Plot  data  analyzed 

tsignal=[(offset*L+l):(offset+Nwd)*L]/Fs; 
ysection=y  (offset*  L+ 1 :  (Nwd+offset)  *  L) ; 
plot(tsignal,ysection) ; 

axis([(offset*L+l)/Fs  (offset+Nwd)*L/Fs  min(ysection)  niax(ysection)]) 
title(‘Chosen  section  of  time  signal’);  xlabel(‘time,  s’) 

%  Plot  FT  of  data  analyzed 
K=L*Nwd; 

fsignal=[0:K/2-l]/K*Fs; 

Y=fft(y(offset*L+l:(Nwd+offset)*L)); 
plot(fsignal,abs(Y(l  :K/2))); 

title(‘FT  of  chosen  section’);  xlabel(‘ffequency,  Hz’); 

%  Plot  spectrogram  of  whole  data 

tspec=[l:Ns]/Fs; 

fspec=[0:L/2-l]/L*Fs; 

SPEC=specgram(y,256,Fs); 

imagesc(tspec,fspec,abs(SPEC));axis  xy;  colormap(jet); 

title(‘ Spectrogram  of  the  data’) 

ylabel(‘ frequency,  Hz’);  xlabel(‘time,  s’); 

hold  on; 

plot([(offset*L+l)/Fs  (offset*L+l)/Fs],  [min(fspec)  max(fspec)],’red’); 
plot([(offset*L+Nwd*L)/Fs  (offset*L+Nwd*L)/Fs],  [min(fspec)  max(fspec)],’red’); 
hold  off 

%  Plot  spectrogram  of  section  of  data  only 
tsignal=[(offset*L+l):(offset+Nwd)*L]/Fs; 

H0:L-l]/L*Fs/2; 

ysection=y(offset*L+l:(Nwd+offset)*L); 

SPEC=specgram(ysection,256,Fs,64,23); 
imagesc(tsignal,f,abs(SPEC));axis  xy;  colormapOet); 
title(‘ Spectrogram  of  the  section’) 
ylabel(‘ frequency,  Hz’);  xlabel(‘time,  s’); 
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A.5  MATLAB  FUNCTION  TO  COMPUTE  VARIOUS  QTFRS  OF  A  SECTION  OF  THE 
DATA  (varioustfrs.m) 


function  [WD,SPEC,PWD,SPWD,AD,PAD,power_WD,power_PWD]=various_tffs(filename,N,  offset); 
%  VARIOUS_TFRS  Computes  various  QTFRs  of  a  block  of  data  of  length  N  starting  at  the 
%  offset.  The  QTFRs  include  the  Wigner  distribution  (WD),  the  spectrogram 

%  (SPEC),  the  pseudo  Wigner  distribution  (PWD),  the  smoothed  pseudo 

%  Wigner  distribution  (SPWD),  the  Altes  Q-distribution  (AD),  the  pseudo 

%  Altes  Q-distribution  (PAD),  the  power  Wigner  distribution  (power_WD), 

%  and  the  power  pseudo  Wigner  distribution  (power  PWD). 

%  Antonia  Papandreou-Suppappola  24April  1 996 

[y,header,Fs,Ns]=loadkay(filename,N,offset); 
disp(‘  ‘);  disp(header); 

%  Duration  of  digitized  cut 
T=Ns/Fs; 

disp(‘  ‘);  disp(sprintf(‘ Signal  duration  in  seconds:  %g’,T)); 

%  Subtract  dc  offset  due  to  recording  equipment 
%y=y-mean(y); 

%  Highpass  filter  to  remove  really  low  frequencies  due  to  engine  noise 
b=firl(40,0.05,’high’); 

ysection=conv(y,b);  ysection=ysection(20:N+20-l); 

tsignal=[(offset+l):(offset+N)]/Fs; 

%  WD  of  section 
f^[0:N/2-l]/N*Fs/2; 

WD=wd(ysection);  WD=flipud(WD); 

WD=WD(N/2+l  :N,:);  %  Since  real,  keep  only  half  the  frequencies 

imagesc(tsignal,f,WD);axis  xy;  colormap(jet); 
title(sprintf(‘WD,  filename  =  %s’,  filename)) 
ylabel(‘ frequency,  Hz’);  xlabel(‘time,  s’); 

%  Spectrogram  of  section 
f=[0:N/2-l]/N*Fs/2; 

SPEC=spec(ysection,hamming(  1 1  ),N); 

SPEC=flipud(SPEC);  SPEC=SPEC(N/2+l  :N,:); 
imagesc(tsignal,f,abs(SPEC));axis  xy;  colormap(jet); 
title(sprintf(‘SPEC,  filename  =  %s’,  filename)) 
ylabel(‘frequency,  Hz’);  xlabel(‘time,  s’); 
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%  Pseudo  WD  of  section 
f^[0;N/2-l]/N*Fs/2; 

PWD=pwd(ysection,hanning(  1 1  ),N); 

PWD=flipud(PWD);  PWD=PWD(N/2+ 1  :N, :);  %  Since  real,  keep  only  half  the  frequencies 

imagesc(tsignal,f,PWD);axis  xy;  colormap(jet); 
title(sprintf(‘PWD,  filename  =  %s’,  filename)) 
ylabel(‘frequency,  Hz’);  xlabel(‘time,  s’); 

%  Smoothed  Pseudo  WD  of  section 
t^[0:N-l]/N*Fs/2; 

SPWD=spwd(ysection,hanning(l  l),blackman(33),N); 

SPWD-flipud(SPWD); 

SPWD=SPWD(N/2+l  :N,:);  %  Since  real,  keep  only  half  the  frequencies 
imagesc(tsignal,f,SPWD);axis  xy;  colormap(jet); 
title(sprintf(‘SPWD,  filename  =  %s’,  filename)) 
ylabel(‘ frequency,  Hz’);  xlabel(‘time,  s’); 

%  Altes  Q-distribution  of  section 

input=hilbert(ysection);  input=input.’;  %  to  make  data  analytic 
[warptimesig,  M,  df,  dv,  length2]=prewarp_HC(input);  %  warp  data 
gg=wd(warptimesig.’);  %  compute  WD  of  warped  data 
AD=unwarp_HC(gg,  M,  df,  dv,  length2);  %  transform  time  and  frequency  axes 
f=[0:N-l]/N*Fs/2; 

imagesc(tsignal,f,flipud(AD));axis  xy;  colormap(jet); 
title(sprintf(‘AD,  filename  =  %s’,  filename)) 
ylabel(‘frequency,  Hz’);  xlabel(‘time,  s’); 

%  Pseudo  Altes  Q-distribution  of  section 

inpufr=hilbert(y section);  input=input.’;  %  to  make  data  analytic 

[warptimesig,  M,  df,  dv,  length2]=prewarp_HC(input);  %  warp  data 
gg=pwd(warptimesig.’,hamming(l  l),length2);  %  compute  PWD  of  warped  data 
PAD=unwarp_HC(gg,  M,  df,  dv,  length2);  %  transform  time  and  frequency  axes 
fr=[0:N-l]/N*Fs/2; 

imagesc(tsignal,f,flipud(PAD));axis  xy;  colormap(jet); 
title(sprintf(‘PAD,  filename  =  %s’,  filename)) 
ylabel(‘frequency,  Hz’);  xlabel(‘time,  s’); 

%  Power  WD  of  section 
input=hilbert(ysection) ;  input=input.  ’ ; 
kappa=2;  %  kappa>l 

[warptimesig,  M,  df,  dv,  length2]=prewarp_PC(input, kappa);  %  warp  data 

gg=wd(warptimesig.’);  %  compute  WD  of  warped  data 

power_WD=unwarp  PC(gg,  M,  df,  dv,  length2,kappa);  %  transform  time  and  frequency  axes 
f=[0:N-l]/N*Fs/2; 

imagesc(tsignal,f,flipud(power_WD));axis  xy;  colormap(jet); 
title(sprintf(‘Power  %g  WD,  filename=%s’,  kappa,  filename)) 
ylabel(‘ frequency,  Hz’);  xlabel(‘time,  s’); 
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%  Power  pseudo  WD  of  section 
input=hilbert(ysection);  input=input. 
kappa=2;  %  kappa>l 

[warptimesig,  M,  df,  dv,  length2]=prewarp_PC(input, kappa);  %  warp  data 
gg=pwd(warptimesig.’,hamming(l  l),length2);  %  compute  PWD  of  warped  data 

power_PWD=unwarp_PC(gg,  M,  df,  dv,  length2, kappa);  %  transform  time  and  frequency  axes 
f=[0:N-l]/N*Fs/2; 

imagesc(tsignal,f,flipud(power_PWD));axis  xy;  colormap(jet); 
title(sprintf(‘ Power  %g  pseudo  WD,  filename=%s’,  kappa,  filename)) 
ylabel(‘frequency,  Hz’);  xlabel(‘time,  s’) 
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A.6  MATLAB  FUNCTION  TO  COMPUTE  THE  SMOOTHED  PSEUDO-WIGNER 
DISTRIBUTION  USING  A  DIRECT  METHOD  (spwd_dir.in) 


function  [SPWD]  =  spwd_dir(x,wf,wt) 

%  SPWD  DIR  Computes  the  smoothed  pseudo  Wigner  distribution  (SPWD)  of  the  time 
%  domain  signal  x  using  a  frequency  window  wf,  and  a  time  window  wt.  On 

%  exit,  the  SPWD  is  an  M  x  length(x)  real  matrix,  where  M  is  the  length  of 

%  the  longest  window.  The  local  autocorrelation  function  (LAF)  of  x 

%  is  first  computed  and  then  convolved  with  the  time-smoothing  window  wt. 

%  The  LAF  is  updated  so  that  the  previous  value  can  be  used  to  compute  the 

%  next  value.  The  result  of  the  convolution  then  gets  multiplied  with  the 

%  squared  magnitude  of  the  frequency-smoothing  window,  and  then  the 

%  Fourier  transform  of  the  product  is  obtained.  Note  that  the  length  of  the 

%  windows  must  be  smaller  than  that  of  the  signal,  and  the  length  of  wt  must 

%  be  even. 


%  Byeong-Gwan  lem,  2 1  June  1 996;  Antonia  Papandreou-Suppappola,  27June  1 996 


[rows  cols]  =  size(x); 

if  cols  ~=  1,  x=x.’;  end  %  make  column  vector 


N=length(wt);  %  size  of  window  for  time  smoothing 

M=length(wf);  %  size  of  window  for  frequency  smoothing 

Lshift=(N-l);  %  shifting  amount  for  compensating  the  time  delay  in  resulting  QTFR 

M1=M; 

x=[x;zeros(Lshift,  1 )] ; 

ndata=length(x);  %  Number  of  samples  in  the  time  domain 
wt=flipud(wt);  %  Time  reversal  of  the  time  domain  window 

if  M  >  N  %  Relocation  of  frequency  smoothing  window 
wl^[wf(fix(M/2)+l  :M,l);wf(l  :fix(M/2),l)]; 
else 

wf^[wf(fix(M/2)+l  :M,  1  );zeros(N-M,  1  );wf(l  ;fix(M/2),  1 )]; 

M1=N; 

end 

for  k=l:N, 

wtl(:,k)=wt;  %  Construction  of  window  matrix  by  stacking  the  window  vector 
end 

L=N/2; 

q=zeros(N,N);  %  Initialization  of  a  block  of  LAF 
xl=zeros(N,l);  %  Initialization  of  a  block  of  data 
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for  k=l  :ndata, 

q(2:N,:)=q(l  :N-1 %  Shifting  of  LAF  for  updating  it 
%  Calculation  of  local  correlation  vector  at  current  time  index 
xl=[x(k,l);xl(l:(N-l),l)]; 
x2=[zeros(L- 1 , 1  );x  1  ;zeros(L+ 1,1)]; 
q(l,2:N)=conj(x2(L:l:(N+L-2))’).*x2((N+L-2);-l:L)’; 

%  Time  domain  smoothing 

ql=conj(sum(q.*wtl)’);  %  Convolution  at  current  time  index 
if  N  <  M, 

q  1 =[q  1  (fix(N/2)+ 1  :N,  :);zeros((M-N),  1  );q  1  ( 1  :fix(N/2),:)] ; 
else 

ql=[ql(fix(N/2)+l:N,:);ql(l:fix(N/2),;)]; 

end 

%  Frequency  domain  smoothing 
ql  =  ql.*(wf^2); 

C  =  2*fft(ql);  C  =  real(fftswap(C)); 

Cl(:,k)  =  C(Ml/2+l:Ml);  %  To  only  keep  the  high  frequencies  for  real  data 
end 

Cl=[Cl(:,Lshift:ndata)  Cl(:,l  :(Lshift-l))];  %  compensating  for  the  time  delay 
SPWD=[C1  (;,  1  :(ndata-Lshift))];  %  compensating  for  the  end  effect 
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A.7  MATLAB  FUNCTIONS  TO  COMPUTE  THE  SMOOTHED  PSEUDO-WIGNER 
DISTRIBUTION  USING  THE  SHORT-TIME  TECHNIQUE 


A.  7.1  Function  stafcetm 


function  [SPWD] 
%  STAFCET 

% 

% 

% 

% 

% 

% 

% 

% 

% 


stafcet(filenanie,deciniation_factor,timeoffset,L,freq_window,time_window) 
Computes  the  short-time  smoothed  pseudo-Wigner  distribution  (SPWD) 
using  the  short-time  ambiguity  function  (STAF).  It  computes  the 
ambiguity  function  of  the  windowed  data,  multiplies  it  with  the  fixed 
kernel,  takes  two-dimensional  FFT,  and  keeps  only  the  time-slice  at  the 
center  of  the  windowed  data  for  the  resulting  QTFR.  The  resulting  QTFR 
has  dimensions  (Nw  x  N/Nw)  where  N  is  the  data  length,  and  Nw  is  length 
of  the  rectangular  window  used  to  segment  the  data.  The  sound  data  is 
decimated  by  a  factor  decimation_factor.  The  data  starts  at  time  timeoffset 
and  has  length  L.  The  frequency  window  of  the  SPWD  is  freq_window, 
and  the  time  window  of  the  SPWD  is  time  window. 


%  Antonia  Papandreou-Suppappola,  17Julyl996 

[y, header, F  s  ,N  s]  =loadkay  (filename) ; 
disp(‘  ‘);  disp(header); 

T=Ns/Fs;  %  Duration  of  digitized  cut 

disp(‘  ‘);  disp(sprintf(‘ Signal  duration  in  seconds:  %g’,T)); 

t_y=[l:Ns]/Fs;  plot(t_y,  y); 

%  Highpass  filter  to  remove  low  frequencies  due  to  engine  noise 
b=firl(40,0.05,’high’); 

y=conv(y,b);  y=y(20:Ns+20-l);  %  since  length  of  y  increased  by  40  samples 

ydec=decimate(y,decimation_factor);  %  Decimate  the  data  to  reduce  number  of  samples 
Fsdec=Fs/decimation_factor;  Nsdec=length(ydec); 

%t_ydec=[l  :Nsdec]/Fsdec;  plot(t__ydec,ydec);  specgram(ydec,256,Fsdec); 

%  Choose  a  section,  starting  at  timeoffset  secs 
offset=floor(timeofFset*Fsdec); 
tsection=[(ofFset+ 1  ):offset+L]/Fsdec; 

ysection=ydec(offset+l  :offset+L);  %  From  decimated  signal 
duration_signal_analyzed=max(tsection)-min(tsection); 

%  Start  computation  of  short-time  SPWD 
Nw=256;  %  length  of  windowed  signal 
time_sections=floor(L/Nw);  %  number  of  signal  sections 
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%  examples  of  windows  for  SPWD 

%fTeq_window=hamming(  140);  time_window=blackman(  121); 

%ker_WD=ones(Nw,Nw);  %  Compute  Wigner  distribution  kernel 
ker_SPWD=kerspwd(Nw,Nw,ffeq_window,time_window);  %  Compute  SPWD  kernel 

for  i=l  :time_sections, 

ywindow=ysection((i-l)*Nw+l  :i*Nw);  %  shift  rectangular  window  to  segment  data 

twindow=(i-l)*Nw+l  :i*Nw;  %  section  of  time  covered 

%  plot(twindow,abs(ywindow)) 

AFwindow=af(ywindow);  %  Compute  AF  of  windowed  data,  STAF 
AFproduct=AFwindow.*ker_SPWD;  %  Obtain  STAF  and  kernel  product 
SPWD(;,i)=aftowd_section(AFproduct);  %  Compute  2-D  FFT  and  keep  only  center  time  of 
window 
end 

SPWD=SPWD(Nw/2+l  :Nw,:);  %  For  real  data,  keep  only  positive  frequencies 

f=[0:Nw/2-l]/Nw*Fsdec/2;  %  frequency  range  depending  on  Fsdec 

t=Nw/2:Nw:L-Nw/2;  %  each  time  slice  is  the  center  of  the  xsection 

%  To  obtain  an  image  plot  of  the  SPWD 

m2=max(max(SPWD));  contour(tsection,f,flipud(SPWD),[0. 1  *m2:0.2*m2;0.9*m2]); 
imagesc(tsection,fliplr(f),flipud(SPWD));axis  xy;  colormap(jet); 

title(sprintf(‘SPWD,  filename  =  %s’,  filename));  ylabel(‘frequency,  Hz’);  xlabel(‘time,  s’); 


A.  7.2  Function  kerspwd.m 


function  [ker]=kerspwd(Ntau,Nnu,h,g) 

%  KERSPWD  Computes  the  smoothed  pseudo-Wigner  distribution  (SPWD)  kernel  in  the 
%  ambiguity  domain,  ker  is  a  Nnu  x  Ntau  matrix  with  entries  given  by 

ker(v,g)  = 

%  n(g/2)  *  conj(n(-g/2))  *  G(v)  where  G(v)  represents  the  Fourier  transform 

of 

%  window  function  g,  and  h  is  a  window  function. 

%  Antonio  H.  Costa,  March  13,  1993 

ker=kerpwd(Ntau,Nnu,h);  Lx=Ntau;  Lg=length(g);  Ng=(Lg  -  l)/2;  Nx=Lx  /  2; 
gx=zeros(Lx,l);  gx(l:Ng+l)=g(Ng+l:Lg);  gx(Lx-Ng+l:Lx)=g(l:Ng);  gg=fft(gx); 
gg=fftshift(gg); 
for  i  =  1  :Ntau, 
ker(:,i)  =  ker(:,i).*gg; 
end 
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A.  7.3  Function  kerpwd.m 


function  [ker]=kerpwd(Ntau,Nnu,h) 

%  KERPWD  Computes  the  pseudo  Wigner  distribution  (PWD)  kernel  in  the  ambiguity 

%  domain,  ker  is  a  Nnu  x  Ntau  matrix  with  entries  given  by  ker(v,g)  =  n(g/2) 

%  *  conj’(n(-g/2)).  h  is  the  window  function. 

%  Antonio  H.  Costa,  March  13, 1993 

h  =  h  . *  conj(h(length(h):- 1 : 1 ));  Lh  =  length(h);  Nh  =  round((Lh  - 1 )  /  2);  Nx  =  Ntau  /  2; 
hx  =  zeros(Ntau,l);  hx(Nx  -  Nh  +  1  :Nx  -  Nh  +  Lh)  =  h;  ker  =  zeros(Nnu,Ntau); 
for  j  =  l:Nnu, 
ker(j,:)  =  hx.’; 
end 


A.  7.4  Function  af.m 


function  [A]=af(x,NF) 

%  AF  This  function  calculates  the  ambiguity  function  (AF)  of  the  signal  x  where  x  must  be  a 
%  column  vector.  A  is  a  complex  NF  x  N  matrix,  where  N=length(x).  If  NF  is  not  specified, 

%  then  NF=N,  provided  that  N  is  even;  otherwise,  NF=N+1 .  For  speed  purposes,  NF  should 

%  be  a  power  of  2! 

%  Antonio  H.  Costa,  March  13,  1993 

N  =  length(x);  [rows  cols]  =  size(x); 
if  cols  ~=  1,  x=x.’;  end 
if  nargin  ==  1 , 
if  rem(N,2)  ==  1, 

NF  =  N+  1; 
else 

NF  =  N; 
end 

elseif  nargin  =  2, 
if  rem(N,2)  =  0, 
ifNF<N, 

error(‘Number  of  FFT  points  must  be  >=  to  the  length(x)’); 
end 
else 

ifNF<N, 

error(‘Number  of  FFT  points  must  be  larger  than  length(x)’); 
end 
end 
else 
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error(‘Only  1  or  2  arguments  allowed!...’); 
end 

M  =  N-  l;L  =  N/2;x  =  [zeros(L-l,l);x;zeros(L+l,l)];  q  =  zeros(N,N); 
forj  =  1:N, 

q(2:N,j)  =  xO'  :lVl+j-l).*conj(x(M+j-l  :-l  :j)); 
end 

q=[q(:,N/2+l;N)  zeros(N,NF-N)  q(:,l:N/2)]; 

A  =  2  *  fft(q.’,NF);  A  =  fftswap(A);  A  =  flipud(A); 


A.  7.5  Function  aftowd_section.m 


function  [Wsection]  =  aftowd_section(A) 

%  AFTOWD_SECTION  Computes  the  two-dimensional  FFT  of  A  for  only  one  time  slice. 
%  The  matrix  A  should  be  viewed  as  having  "time  lag"  increasing  as 

%  one  progresses  along  each  column,  with  column  1  representing 

%  time  lag  -K  where  K  equals  the  number  of  columns  of  A. 

%  Antonia  Papandreou-Suppappola,  7Julyl996 
%  Antonio  H.  Costa,  March  13,  1993 

[rows  cols]  =  size(A); 
if  rows  >  cols, 
k  =  (rows  -  cols)  /  2; 

A  =  [zeros(rows,k)  A  zeros(rows,k)]; 
end 

A  =  fftshift(A); 

A  =  fft(A.’,rows);  %  Take  FT  of  each  row 
A  =  A.’;  A  =  flipud(A); 

W  =  ifft(A);  W  =  W.’;  W  =  flipud(W);  W  =  real(W); 

W  =  fftswap(W);  W  =  fftswap(W.’);  W  =  W.’; 
if  rows  >  cols, 

W  =  W(:,k+l:k+cols); 
end 

W=fftswap(flipud(W)); 

Wsection=W(:,cols/2);  %  keeping  only  the  center  time  slice 


A-22 


A.8  MATLAB  FUNCTION  TO  COMPUTE  THE  SMOOTHED  PSEUDO-ALTES  Q- 
DISTRIBUTION  USINGTHE  SHORT-TIME  TECHNIQUE  (hcstafcetm) 


function  [SPAD]  =  hcstafcet(filename,decimation_factor,timeoffset,L,ffeq_window,time_window) 

%  HCSTAFCET  Computes  the  hyperbolic  class  smoothed  pseudo-Altes  Q-distribution  (SPAD) 

%  by  hyperbolically  warping  the  signal,  computing  Cohen’s  class  smoothed 

%  pseudo-Wigner  distribution  (SPWD)  of  the  warped  signal,  and  transforming  the 

%  axes  for  correct  time-frequency  localization.  When  the  axes  are  transformed,  only 

%  the  time-slice  at  the  center  of  the  windowed  data  for  the  resulting  QTFR  is 

%  kept.  The  resulting  SPAD  has  dimensions  (Nw  x  N/Nw)  where  N  is  the  length  of 

%  the  data,  and  Nw  is  length  of  the  rectangular  window  used  to  segment  the  data. 

%  The  sound  data  is  decimated  by  a  factor  decimation  factor.  The  data  to  be 

%  analyzed  starts  at  time  timeoffset  and  has  length  L.  The  frequency  window  of  the 

%  SPWD  is  freq_window,  and  the  time  window  of  the  SPWD  is  time  window. 

%  Antonia  Papandreou-Suppappola,  27Julyl996 

[y,header,Fs,Ns]=loadkay(filename); 
disp(‘  ‘);disp(header); 

%  Duration  of  digitized  eut 

T=Ns/Fs;  disp(‘  ‘);  disp(sprintf(‘ Signal  duration  in  seconds:  %g’,T)); 

%t_y=[l  :Ns]/Fs;  plot(t_j,  y); 

%  Highpass  filter  to  remove  low  frequencies  due  to  engine  noise 
b=firl  (40,0. 05, ’high’); 

y=conv(y,b);  y=y(20:Ns+20-l);  %  since  length  of  y  increased  by  40  samples 

%  Decimate  the  data  to  reduce  number  of  samples 
ydec=decimate(y,decimation_factor); 

Fsdec=Fs/decimation_factor;  Nsdec=length(ydec); 

%t_ydec=[l  :Nsdec]/Fsdec;  plot(t_ydec,ydec);  specgram(ydec,256,Fsdec); 

%  Choose  a  section,  starting  at  timeoffset  secs 
offset=floor(timeofFset*Fsdec); 
tsection=[(offset+ 1 )  :offset+L]/Fsdec; 

ysection=ydec(offset+l  :offset+L);  %  From  decimated  signal 
duration_signal_analyzed=max(tsection)-min(tsection); 

%  Start  computation  of  short-time  SPAD 
Nw=128;  %  length  of  windowed  signal 

time_sections=floor(L/Nw);  %  Number  of  signal  sections 

%  Examples  of  SPWD  windows  of  warp  signal 
%freq_window=hamming(l  58);  time_window=blackman(145); 
ker_SPWD=kerspwd(Nw*4,Nw*4,freq_window,time_window); 

for  i=l  :time_sections, 

ywindow=ysection((i- 1  )*Nw+l  :i*Nw);  %  shift  rectangular  window  to  segment  data 

%twindow=(i-l)*Nw+l:i*Nw;  plot(twindow,abs(3rwindow));  %  section  of  time  covered 
[warptimesig,  M,  df,  dv,  length2]=prewarp_HC(ywindow.’);  %  warp  signal 
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%  gg=wd(warptimesig);  %  For  Altes  Q-distribution 

mm=af(warptimesig);C=mni.*ker_SPWD;gg=aftowd(C);  %  SPWD  of  warped  signal 
%  Keep  positive  frequencies  and  center  time,  and  transform  axes 
SPAD(:,i)=unwarp_HCsection(gg,  M,  df,  dv,  length2); 
end 

l=[0:Nw/2-l]/Nw*Fsdec/2;  %  frequency  range  depending  on  Fsdec 
imagesc(tsection,f,flipud(SPAD));axis  xy;  colormap(jet); 
title(sprintf(‘SPAD,  filename  =  %s’,  filename)) 

ylabel(‘frequency,  Hz’);  xlabel(‘time,  s’);  %  Max  frequency  is  Fs/dec  factor/4 
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A.9  MATLAB  FUNCTION  TO  COMPUTE  THE  k  TH  POWER  SMOOTHED  PSEUDO- 
WIGNER  DISTRIBUTION  USING  THE  SHORT-TIME  TECHNIQUE  (pcstafcetm) 


function  [PSPWD]  =  pcstafcet(filename, decimation_factor,timeoffset,L, kappa, freq_window,time_window) 

%  PCSTAFCET  Computes  the  kappa  power  smoothed  pseudo-Wigner  distribution  (PSPWD)  of 

%  the  power  class  by  power  warping  the  signal,  computing  Cohen’s  class  smoothed 

%  pseudo-Wigner  distribution  (SPWD)  of  the  warped  signal,  and  transforming  the 

%  axes  for  correct  time-frequency  localization.  When  the  axes  are  transformed,  only 

%  the  time-slice  at  the  center  of  the  windowed  data  for  the  resulting  QTFR  is  kept. 

%  The  resulting  PSPWD  has  dimensions  (Nw  x  N/Nw)  where  N  is  the  length  of  the 

%  data,  and  Nw  is  the  length  of  the  rectangular  window  used  to  segment  the  data. 

%  The  sound  data  is  decimated  by  a  factor  decimation_factor.  The  data  to  be 

%  analyzed  starts  at  time  timeoffset  and  has  length  L.  The  frequency  window  of  the 

%  SPWD  is  freq_window,  and  the  time  window  of  the  SPWD  is  time  window. 

%  Antonia  Papandreou-Suppappola,  170ctl996 

[y,header,Fs,Ns]=loadkay(filename); 
disp(‘  ‘);  disp(header); 

%  Duration  of  digitized  cut 

T=Ns/Fs;  disp(‘  ‘);  disp(sprintf(‘ Signal  duration  in  seconds;  %g’,T)); 

%t_y=[l  :Ns]/Fs;  plot(t_j,  y); 

%  Highpass  filter  to  remove  low  frequencies  due  to  engine  noise 
b=firl(40,0.05,’high’); 

y=conv(y,b);  y=y(20:Ns+20-l);  %  since  length  of  y  increased  by  40  samples 

%  Decimate  the  data  to  reduce  number  of  samples 
ydec=decimate(y,decimation_factor); 

Fsdec=Fs/decimationfactor;  Nsdec=length(ydec); 

%t_ydec=[l  :Nsdec]/Fsdec;  plot(t_ydec,ydec);  specgram(ydec,256,Fsdec); 

%  Choose  a  section,  starting  at  timeoffset  secs 
offset=f]oor(timeoffset*F  sdec); 
tsection=[(offset+l):offset-t-L]/Fsdec; 

ysection=ydec(offset+l  :offset+L);  %  From  decimated  signal 
duration_signal_analyzed=max(tsection)-min(tsection); 

%  Start  computation  of  short-time  power  smoothed  pseudo  Wigner  distribution 
Nw=128;  %  length  of  windowed  signal 

time_sections=floor(L/Nw);  %  Number  of  signal  sections 

%  Examples  of  SPWD  windows  of  warp  signal 
%freq_window=hamming(  158);  time_window=blackman(  1 45); 
ker_SPWD=kerspwd(Nw*4,Nw*4,freq_window,time_window); 

for  i=l;time_sections, 

ywindow=ysection((i-l)*Nw-i-l  :i*Nw);  %  shift  rectangular  window  to  segment  data 

%twindow=(i-l)*Nw+l:i*Nw;  plot(twindow,abs(ywindow));  %  section  of  time  covered 
[warptimesig,  M,  df,  dv,  length2]=prewarp_PC(ywindow.’,kappa);  %  warp  signal 
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%  gg=wd(warptimesig);  %  For  power  Wigner  distribution 

mm=af(warptimesig);C=mm.*ker_SPWD;gg=aftowd(C);  %  SPWD  of  warped  signal 
%  Keep  positive  frequencies  and  center  time,  and  transform  axes 
PSPWD(:,i)=unwarp_PCsection(gg,  M,  df,  dv,  length2); 
end 

t=[0:Nw/2-l]/Nw*Fsdec/2;  %  frequency  range  depending  on  Fsdec 
imagesc(tsection,f,flipud(PSPWD));axis  xy;  colormap(jet); 
title(sprintf(‘PSPWD,  filename  =  %s’,  filename)) 

ylabel(‘frequency,  Hz’);  xlabel(‘time,  s’);  %  Max  frequency  is  Fs/dec_factor/4 
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